AWS re:Invent 2023 - Keynote with Dr. Swami Sivasubramanian

Captions
[music playing] Please welcome the Vice President of Data and AI at AWS, Dr. Swami Sivasubramanian. [music playing] [applause] Hello, everyone. Welcome to re:Invent 2023. This year, I'm especially excited to stand on this stage and share our vision with all of you. That's because, this year, we are standing at the edge of another technological era, an era in which a powerful relationship between humans and technology is unfolding right before us. Generative AI is augmenting human productivity in many unexpected ways, while also fueling human intelligence and creativity. This relationship, in which both humans and AI form new innovations, is rife with so many possibilities. This partnership is similar to many symbiotic relationships we observe in nature, where different species not only co-exist together, but also benefit from one another, like we see with the whale sharks and the remora fish. Whale sharks offer the fish protection from their predators, while the fish keep the shark clean and healthy. There are even symbiotic stars that share gases and heat, occasionally producing a beautiful explosion of energy called a supernova. Today, the world is captivated by generative AI's potential to reinvent the way we work and form new innovations, our own supernovae, if you will, but the exploration of this symbiotic relationship between humans and technology is not really new. In fact, over the last 200 years, mathematicians, computer scientists, and many visionaries have dedicated their entire lives to inventing new technologies that have really reduced manual labor and automated many complex human tasks, from new machines for mathematical computation and big data processing, to new architectures and algorithms to recognize patterns and make predictions, to new programming languages that made it significantly easier for people to work with data. One of those visionaries was Ada Lovelace, who is also known as the world's first computer programmer. Ada is often recognized for her work with Charles Babbage on the Analytical Engine, one of the earliest versions of a computer, in the 1830s. Like Babbage, Ada believed that computers could remove the heavy lifting from many computational and analytical tasks, but what set Ada apart from many of her contemporaries at the time was her ability to recognize the potential of computers, that they could go way beyond just glorified number crunching. She discovered that they could be programmed to read symbols and perform a logical sequence of operations. This discovery marked a shift towards using computers for complex tasks. Ada also speculated that these computers could understand musical notation, and they could even create music in the future. This could be called an early ode to the potential of generative AI. But despite her belief in their potential, she made one thing really, really clear. These machines could not really originate anything. They could only generate outputs or perform tasks that humans were capable of ordering them to do. True creativity and intelligence, she argued, only originates from humans. In this way, humans and computers would have a mutually beneficial relationship, with each contributing their own unique strengths. Personally, Ada's analysis of computing is really inspiring to me, not only because her work stood the test of time, but also because I have a young daughter in STEM. In fact, Ada is one of my daughter's favorite people from the book Good Night Stories for Rebel Girls.
So, if you ask my daughter, she will tell you that I have a career right now because of Ada, which I think is really cool. Now, today, it's clear that Ada's contributions are gaining even more relevance as the world of generative AI unfolds. While ML and AI have spurred rapid innovation over the last couple of decades, I believe this new era of human-led creativity with GenAI will shift our relationship with technology towards one of intelligence augmentation. I find this co-evolution of humans and technology incredibly exciting, because there is also another symbiotic relationship happening under the hood that is key to driving our progress forward here: the relationship between data and GenAI. The massive explosion of data has enabled these foundational models, the large language models that power generative AI, to exist in the first place and to accelerate new innovations everywhere. With these new advancements, our customers can harness that data to build and customize these apps with models faster than ever before, while also using GenAI to make it easier to work with and organize that data. So, while we typically think of symbiosis as a relationship between two different things, I like to think of what's happening today as a beneficial relationship among three things, where data, generative AI, and humans are all unleashing their creativity together. Let's explore this relationship further, starting with developing GenAI apps. To build a GenAI app, you need a few essentials. You need access to a variety of LLMs and foundational models to meet your needs. You need a secure and private environment to customize these models with your data. Then you need the right tools to build and deploy new applications on top of these models, and you need a really powerful infrastructure to run these data-intensive and ML-intensive applications. AWS has a long history of providing our customers with a comprehensive AI/ML, data, and compute stack just for this, and now, according to HG Insights, more machine learning workloads run on AWS than on any other cloud provider. We like to think of our AI/ML offerings as a three-layer stack. At the lowest layer is the infrastructure tier you will need to cost-effectively train these foundational models and deploy them at scale, including our hardware chips and GPUs. This layer also includes Amazon SageMaker, our service which enables ML practitioners to easily build, train, and deploy ML models, including these LLMs. At the middle layer, we have Amazon Bedrock, which provides access to the leading LLMs and other FMs to build and scale generative AI applications. And at the top layer are the applications that help you take advantage of GenAI, without the need for any specialized knowledge or having to write any code. This includes services like Amazon Q, our new GenAI-powered assistant that is tailored to your business. Each of these layers builds on the other, and you may need some or even all of those capabilities at different points in your GenAI journey. Let's take a closer look at our tools for building GenAI applications, starting with Amazon Bedrock. Bedrock has a broad set of capabilities for building and scaling GenAI applications with the highest levels of privacy and security. One of the main reasons customers gravitate towards Bedrock is the ability to select from a wide range of leading foundational models that support their unique needs. This customer choice is paramount, because we believe no one model will rule the world.
We are still in the early days with GenAI, and these models will continue to evolve at unprecedented speeds. That's why customers need the flexibility to use different models at different points for different use cases. For this, Bedrock has you covered. We offer customers a broad choice of the latest foundation models from leading providers like AI21 Labs, Anthropic, Cohere, Stability AI, and Meta. One of our most popular models is Anthropic's Claude model, which is used for tasks like summarization and complex reasoning. Customers also use Stability AI's Stable Diffusion model to generate images, art, and designs, and recently, we announced support for Cohere Command, which can be used for tasks like copywriting and dialogue, and Cohere Embed for search and personalization. But there is more. We recently added support for three new Cohere models: Command Light, Embed English, and Embed Multilingual. We also introduced Meta's Llama 2 on Bedrock, which customers are rapidly adopting for high performance and relatively low cost. And just a couple of weeks ago, we added the Llama 2 13B model, which is optimized for a variety of smaller-scale use cases, and finally, we added support for Stable Diffusion XL 1.0, Stability AI's advanced text-to-image model, which is now generally available. But as I told you, in this space it's super early, and choice is paramount. So, today, we are continuing our commitment to offering the latest innovation from many of our model providers, starting with Anthropic. I'm excited to announce Bedrock support for Claude 2.1. [applause] Claude 2.1 delivers advancements in key capabilities for enterprises, including an industry-leading 200K context window, along with improved accuracy in many remarkable ways. According to Anthropic, this update has 50% fewer hallucinations, even in the face of adversarial prompt attacks, and it also has a 2X reduction in false statements in open-ended conversations. Both of these are super important for enterprise use cases. And 2.1 also has improved system prompts, which are model instructions that provide a better experience for end users, while also reducing the cost of these prompts and completions by 25%. And we are not just stopping there. We are also excited to announce new updates for our customers who want to experiment with publicly available models. That's why, today, I'm pleased to announce support for Llama 2 70B in Bedrock. [applause] This model is suitable for many large-scale text-processing tasks, such as language modeling, text generation, and dialogue systems. In addition to our partners, AWS is also heavily investing in new innovations in this area for our customers. We have a longstanding history in AI and ML. Amazon has invested in AI/ML technologies for over 25 years, and many of the capabilities our customers use today are driven by ML models, including these foundational models. These models power virtually everything we do, from our customer-facing e-commerce applications to various facets of our enterprise business and supply chain. Because this ML innovation has benefited our business so much, we wanted to share our key learnings with our customers too. One of the ways we wanted to enable this is by enhancing search and personalization experiences for our customers through a specific data type called vector embeddings. Vector embeddings are produced by these foundation models, which translate text inputs, like words, phrases, or large units of text, into numerical representations.
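As an aside for readers who want to see this in code: the following is a minimal sketch of generating such embeddings and comparing them, assuming the boto3 Bedrock runtime client and the Titan text-embedding model ID available at launch; the model ID and response fields are assumptions and may differ by region or SDK version.

```python
# Hedged sketch: generate Titan text embeddings via Bedrock and compare them.
# Assumes boto3 credentials/region are configured and the model below is enabled.
import json
import math

import boto3

bedrock_runtime = boto3.client("bedrock-runtime")


def embed(text: str) -> list[float]:
    """Return the embedding vector for a piece of text (Titan request/response shape assumed)."""
    response = bedrock_runtime.invoke_model(
        modelId="amazon.titan-embed-text-v1",  # assumed model ID
        body=json.dumps({"inputText": text}),
    )
    return json.loads(response["body"].read())["embedding"]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


cat, kitten, car = embed("cat"), embed("kitten"), embed("car")
print("cat vs kitten:", cosine_similarity(cat, kitten))  # expected to score relatively high
print("cat vs car:   ", cosine_similarity(cat, car))     # expected to score lower
```

The point of the sketch is simply that related words end up with nearby vectors, which is what makes similarity search over embeddings work.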
While we as humans understand text and the meaning and the context of these words, machines only understand numbers. So, we have to translate them into a format that is suitable for machine learning. Vectors allow your models to more easily find the relationships between similar words. For instance, a cat is closer to a kitten, or a dog is closer to a puppy. This means your foundational models can now produce more relevant responses to your customers. Vectors are ideal for supercharging your applications, like rich media search and product recommendation. In this scenario, the use of vector embeddings greatly enhances the accuracy of a query like bright-colored golf shoes. We have used embeddings to support many aspects of our business, like amazon.com. That's why we offer Titan Text Embeddings, which enables customers, like the AI company Griptape, to easily translate their text data into vector embeddings for a variety of use cases. But as our customers are continuing to build more and more applications, they want to combine image and text and support both modalities. For example, imagine a furniture retail company with thousands of images. They want to enable their customers to search for furniture using a phrase, an image, or even both. They could use instructions, like, show me what works well with my sofa. Now, to build such an experience, developers need to spend time piecing together multiple models. Not only does this increase the complexity of your GenAI stack, but it also decreases efficiency and impacts the customer experience. We wanted to make these applications even easier to build. That's why, today, I'm excited to announce the general availability of Titan Multimodal Embeddings. [applause] This model enables you to create richer multimodal search and recommendation options. Now, you can quickly generate, store, and retrieve embeddings to build more accurate and contextually relevant multimodal search. Companies like OfferUp are using Titan Multimodal Embeddings, as well as Alamy, which is using this model to revolutionize their stock image search experience for their customers. In addition to our embeddings models, we also offer models that support text generation use cases. This includes Titan Text Lite and Titan Text Express, which are now generally available. [applause] These text models help you optimize for accuracy, performance, and cost depending on your use cases. Text Lite is a really, really small but extremely cost-effective model that supports use cases like chatbots, Q&A, and text summarization. It is lightweight and ideal for fine tuning, offering you a highly customizable model for your use case. Text Express can be used for a wider range of tasks, such as open-ended text generation and conversational chat. This model provides a sweet spot for cost and performance, compared to these really big models. Now, finally, GenAI image generation is growing in popularity for industries such as advertising and retail, where customers need high-quality visuals at a lower cost. We wanted to help our customers do this easily, accurately, and responsibly. That's why, today, I'm very excited to announce Titan Image Generator, which is available in preview today. [applause] This model enables customers to produce high-quality, realistic images or enhance existing images using simple natural language prompts. You can customize these images using your own data to create content that better reflects your industry or your brand.
Titan Image Generator is trained on a diverse set of datasets to enable you to create more accurate outputs. It also includes built-in mitigations for toxicity and bias. Through human evaluation, we found that Titan Image Generator has higher scores than many other leading models. More importantly, to build on the commitments we made at the White House earlier this year to promote the responsible development of AI technology, all Titan-generated images come with an invisible watermark designed to help reduce the spread of misinformation by providing a discreet mechanism to identify AI-generated images. AWS is among the first model providers to widely release built-in invisible watermarks that are integrated into image outputs and are designed to be tamper resistant. Now, let's take a look at the model's editing features in action. First, I will use the image generator and submit a text prompt, such as image of a green iguana, to get an image quickly to kind of see what I want. Now, I can use the model to easily swap out the existing background for a background of a rainforest. This process is known as out-painting. You can use the model to seamlessly swap out backgrounds to generate lifestyle images, all while retaining the main subject of the image, and to create a few more options, I can use the image playground to generate variations of my original iguana subject, as well as variations of the original rainforest background. Or I can completely change the orientation of the picture from left-facing to right-facing, using a prompt like, orange iguana facing right in a rainforest. This is really cool, right, and there are so many more incredible Image Generator features, such as in-painting and image customization, that I have not showcased here today. Because this model is trained for a broad range of domains, customers across a variety of industries will be excited to take advantage of Titan Image Generator. As you can see, each Titan model has its own unique strengths across capabilities, price, and performance, and as Adam shared yesterday, we have carefully chosen how we train our models and the data we use to do so. We will indemnify customers against claims that our models or their output infringe on anyone's copyright. With these investments, our customers will have the flexibility to select the best models for their requirements, even as their needs grow and change, and our Bedrock customers have quickly taken advantage of different models to build all types of customer experiences. In fact, since we launched Bedrock, more than 10,000 customers are rapidly developing GenAI-powered applications for use cases like self-service, customer support, text analysis, and forecasting trends. This includes customers like SAP, the world's leading provider of enterprise software, which is using Bedrock to automate the creation of trip requests within SAP Concur, saving employees hours of time, including our own Amazon employees. Georgia-Pacific, one of the world's leading manufacturers of paper and pulp, uses Bedrock to power a chatbot system that helps employees quickly retrieve critical factory data and answer questions about their machines. And United Airlines, they use Bedrock to help employees access up-to-date information and summaries on delays using natural language, helping them resolve operational issues and customer issues faster.
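For builders reading along, here is a minimal sketch of what invoking one of these Bedrock-hosted models looks like from code. It assumes the boto3 Bedrock runtime client and the Claude 2.1 model ID; the request and response fields follow the Claude text-completion format that was current at the time and should be treated as indicative rather than authoritative.

```python
# Hedged sketch: ask a Bedrock-hosted Claude model a question.
# Model ID and payload fields are assumptions based on the Claude v2-era
# text-completion schema; newer models use a different (messages) schema.
import json

import boto3

bedrock_runtime = boto3.client("bedrock-runtime")

prompt = (
    "\n\nHuman: Summarize in two sentences why model choice matters "
    "when building generative AI applications.\n\nAssistant:"
)

response = bedrock_runtime.invoke_model(
    modelId="anthropic.claude-v2:1",  # assumed model ID for Claude 2.1 on Bedrock
    body=json.dumps(
        {
            "prompt": prompt,
            "max_tokens_to_sample": 300,
            "temperature": 0.5,
        }
    ),
)

# The completion text is returned in the response body as JSON.
print(json.loads(response["body"].read())["completion"])
```

Swapping the model is largely a matter of changing the model ID and the payload shape, which is what makes experimenting with different providers on the same API practical.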
It is so inspiring to see our customers build with these models on Bedrock and adapt them for their needs, and now, to show you how to put GenAI into action for your business, please welcome Nhung Ho, VP of AI from Intuit. [music playing] Good morning, everyone. Great to see you all bright and early today. How are you all feeling? Good? Come on, that's right. So, as I've been listening over the past few days, it's obvious to me why Swami invited us onto this stage. Over the past decade, Intuit has built on AWS, first from moving our application onto the cloud to AI/ML with SageMaker, and now, in the era of generative AI, with Bedrock. At Intuit, everything we do centers around our mission to power prosperity around the world for 100 million consumers and small business customers, and for me, this mission really hits home. I'm one of ten siblings. Can you imagine being in a family with ten siblings? There's so many of us, we don't even all fit in the same photo. This is the most of us that's ever fit in one photo at any one time, and the reason why this hits really close to home for me is that half of my siblings are small business owners, and so, I deeply understand the everyday challenges that small businesses face, from managing inventory to dealing with cashflow to understanding taxes throughout the year, and so, it's really great for me that, in my everyday work, I get to make their lives easier with AI. I get to build game changing applications that solve problems for myself, my siblings, and probably a lot of you in the audience here today. At Intuit, we're all about leveling the playing field for our customers, and so, to do that, we've been on an incredible transformation journey over the past five years. In 2019, we declared that we were going to be an AI-driven expert platform, and by combining cutting edge AI with tax and human expertise, we're delivering unparalleled experiences for our customers. Today, we've been able to achieve incredible scale with AWS running all of our data capabilities as well as our data lake on AWS, and our machine learning platform, of which SageMaker is a foundational capability, allows our entire community of machine learning developers to build, deploy, and ship new AI experiences with speed, and to give you an idea of what this really means more concretely, this means that we're able to make 65 billion machine learning predictions per day, utilizing over half a million datapoints for our small business customers, 60,000 for our consumers, and then driving over 810 million customer-backed AI interactions per year. Now, in the era of generative AI, we're well positioned to change the game because of our multi-year investment. We've been really honed in on making sure that our data is clean, that there's strong data governance, and that we're building out responsible AI principles, and that's really allowed us to quickly unlock these new opportunities. To enable our technologists to design, build, and quickly ship out AI applications in this GenAI world, we built a proprietary GenAI operating system called GenOS, and this is on AWS, and what GenOS has is four primary components. The first is Gen Studio. This is where anybody at the company can build and prototype and test out new-to-the-world GenAI experiences. 
When they're ready to ship, then they can use Gen Runtime, which has connectivity to a multitude of LLMs and has access to the right underlying data, so that you can build those personalized and accurate experiences but have the comfort that you can scale it out when needed to customers. The next piece is that, when you do deploy and ship these GenAI experiences, you want that consistency across your products. You don't want a Franken experience. So, we built a design system called GenUX, so that you get that transparency, and when a customer interacts with GenAI, they know that they're getting that experience, and I would say the most important component to GenOS is the series of financial large language models, and this is a set of third-party LLMs as well as custom-trained LLMs that are specialized in our domain of tax, accounting, marketing, and personal finance, and you may ask, why the heck do I need to do this myself, why not use what's out there, right, and I'll tell you a couple things that we learned during this journey, is that three things remain constant in the GenAI world: accuracy, latency and cost. Any experience you build will anchor on those three things, and so, data is the key to unlocking accuracy. We all know that, but the ability to use these smaller, faster models allows us to realize significant latency gains, and the great thing is that we're able to host these models on SageMaker, and so, we're able to finally manage cost as well as scale according to our needs, but I also mentioned earlier, we use third party LLMs, because at the end of the day, the thing that you really optimize for is to build the best customer experience possible. To do that, you need to be able to use best-in-class solutions, and that's what Bedrock gives us the ability to do. It gives you optionality. With the wide library of models available on Bedrock that Swami just showed, we're able to fully encompass all of the needs that our customer has, and the other thing that Swami mentioned is that, within Bedrock, we're able to easily scale our underlying inference infrastructure, and all of that is done within our AWS VPC, and so, that gives us the confidence to be able to fully leverage our data and our knowledge base to build these personalized, responsible, and also, relevant experiences for our customers, but also, knowing that the safety, security, and privacy of the data is maintained for our customers. So, in September we launched Intuit Assist, our generative AI assistant that is embedded across all of our product offerings. If you go to TurboTax, you go to QuickBooks, you go to MailChimp, you are going to see Assist, and it's all backed by GenOS. With Assist, what we're really trying to do is help you feel confident in every single financial decision that you make. It's there with you, and so, I'm going to show you what does this looks like in TurboTax. This is live in production for our customers, and we also gather significant customer feedback. So, if you can imagine, when you get to the end of your tax filing experience, you get a number. What does that number mean, right? Like, I have a PhD in natural physics. I barely know what those numbers mean, and so, if you can imagine for the standard everyday user, it's incredibly challenging. 
By marrying the power of our knowledge engine that ensures accuracy with the power of an LLM, we're able to help unpack this outcome for our customers, so that they truly understand and can feel confident to take that next step, whatever it may be, and so, for those who are just beginning their AI journeys, we offer two learnings. The first is to take a holistic approach. Really invest in your underlying data, because that's going to be the differentiator for every single experience that you build, but also, build in horizontal solutions from day one. At some point, demos need to become production experiences, right, and you don't want to get caught by surprise when you're ready to go. The second is that there's no one-size-fits-all LLM solution. Optionality is so incredibly important, and that's what is offered on Bedrock and on AWS, and our collaboration with AWS over the past years has really helped us grow to become a global financial technology platform, and these are just some of the services that have gotten us there. So, I agree with Swami, but I have to agree with him, I'm on the stage, that the massive explosion of data has enabled these foundational models, and you saw, Assist is one of the experiences that is an outcome of our ability to leverage data with an LLM. At Intuit, Assist is just going to be one of the many GenAI experiences that we build, and over our 40-year history, we've gone through many transformations. GenAI is just one of those transformations, and we're going to continue to transform and reinvent over the next 40 years. Thank you so much. [music playing] Thank you, Nhung. Intuit is an excellent example of how you can reimagine your customer experience with easy-to-use tools and access to a variety of these foundational models, and as Nhung demonstrated, there is another component that is critical for creating these GenAI apps that are unique to your business. That is your data. When you want to build GenAI applications that are unique to your business, your data is the differentiator. Data is what takes you from a generic AI application to a GenAI application that understands your customers and your business. So, how do you go about customizing these models with your data? A common technique for customizing these foundational models is called fine-tuning. Fine-tuning is pretty simple: you provide a labeled dataset, annotated with additional context, to train the model on specific tasks. You can then adapt the model parameters to your business, extending its knowledge with lexicon and terminology that are unique to your industry and your customers. Amazon Bedrock removes the heavy lifting from the fine-tuning process, but you can also leverage unlabeled datasets or raw data to maintain the accuracy of your foundational model for your domain, through a continued pre-training process. For example, a healthcare company can continue to pre-train the model using medical journals, articles, or research papers to make it more knowledgeable on the evolving industry terminology. Today, you can leverage both of these techniques with Amazon Titan Text Lite and Titan Text Express. [applause] These models complement each other and will enable your model to understand your business over time, but no matter which method you use, the output model is accessible only to you, and it never goes back to the base model. We announced some of these capabilities early on.
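For readers wondering what kicking off such a customization looks like in practice, here is a rough sketch using the boto3 Bedrock control-plane client. The role ARN, S3 paths, base-model identifier, and hyperparameter values are placeholders, and the exact parameter names should be checked against the current API documentation.

```python
# Hedged sketch: start a fine-tuning (or continued pre-training) job on Bedrock.
# Every ARN, bucket name, model identifier, and hyperparameter below is a placeholder.
import boto3

bedrock = boto3.client("bedrock")  # control-plane client, distinct from "bedrock-runtime"

job = bedrock.create_model_customization_job(
    jobName="shoe-campaign-finetune-001",
    customModelName="marketing-copy-llama2",
    roleArn="arn:aws:iam::123456789012:role/BedrockCustomizationRole",  # placeholder role
    baseModelIdentifier="meta.llama2-13b-v1",  # assumed base-model identifier
    customizationType="FINE_TUNING",  # or "CONTINUED_PRE_TRAINING" for unlabeled data
    trainingDataConfig={"s3Uri": "s3://my-bucket/campaigns/train.jsonl"},
    outputDataConfig={"s3Uri": "s3://my-bucket/campaigns/output/"},
    hyperParameters={"epochCount": "2", "batchSize": "1", "learningRate": "0.00001"},
)

print(job["jobArn"])  # the resulting custom model remains private to your account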
This week, we also added fine tuning in Bedrock for Cohere Command and Llama 2, with fine-tuning for Anthropic Claude coming soon. Now, let me just show you a quick example of how this works. Imagine a content marketing manager who needs to come up with a fresh ad campaign for a new line of shoes. To do this, they select the Llama 2 model and provide Bedrock with a few examples of their best-performing campaigns. Bedrock makes a separate copy of the base model that is accessible only to the customer, and after training, Bedrock generates relevant social media content, display ads, and web copy for the new shoes. With fine tuning, you can build applications that are specific to your business, but what if some of your data changes frequently, like inventory or pricing? It is simply not practical to be constantly fine tuning and updating this model while it is also serving user queries. That's why, to enable a model with up-to-date information from your data sources, you need a different technique called Retrieval Augmented Generation, also known as RAG. With RAG, you can augment the prompt that is sent to your foundational model with contextual information, such as product details, which it draws from your private data sources. This added context in the prompt helps the model provide more accurate and relevant responses to the user's query. However, implementing these RAG-based systems is extremely complex. Developers must first convert their data into vector embeddings. Then they need to store these embeddings in a vector database that can handle vector queries efficiently. Finally, they build custom integrations with the vector database to perform semantic searches, retrieve relevant text, and then augment the prompt. All of this can take weeks, if not months, to build. To make this process easier, yesterday we announced Knowledge Bases for Amazon Bedrock, which supports the entire RAG workflow right from ingestion to retrieval to prompt augmentation. Here, you simply point to the location of your data, like an S3 bucket, and Bedrock fetches the relevant text documents, converts them into embeddings, and stores them in your vector database, and at inference time, it augments the prompts sent to your foundational models with the right context. Knowledge Bases work with popular vector databases, like our vector engine for OpenSearch Serverless, Redis Enterprise Cloud, and Pinecone, and coming soon, we will also add support for Amazon Aurora as well as MongoDB, with more and more databases being added over time. Now, the ability to customize these models with your data is incredibly useful, but you can also extend the power of these models to execute business tasks, like booking travel or processing insurance claims. To do this, developers perform several resource-intensive steps to fulfill a user request, like defining the instructions and orchestrating or configuring the models to access your data sources. That's why yesterday, Adam announced the GA of Agents for Amazon Bedrock, a capability that enables GenAI applications to execute complex tasks by dynamically invoking APIs. Bedrock makes it super easy to create these fully managed agents that connect to your internal systems and APIs on your behalf in just a few steps.
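To make the retrieve-then-augment pattern described above more tangible, here is a small, self-contained sketch of the core RAG loop. The embeddings here are a deliberately toy stand-in (character-frequency vectors) so the example runs on its own; a real application would call an embedding model and one of the vector stores mentioned above, or let Knowledge Bases for Bedrock handle these steps.

```python
# Hedged sketch of the core RAG loop: embed the query, retrieve the closest
# documents, and prepend them to the prompt as context before calling an LLM.
import math

DOCUMENTS = [
    "Golf shoe G-100 comes in bright yellow and green, sizes 7 through 13.",
    "Return policy: unworn shoes may be returned within 90 days.",
    "The trail runner T-45 is waterproof and rated for rocky terrain.",
]


def embed(text: str) -> list[float]:
    """Toy embedding: normalized character-frequency vector (stand-in for a real model)."""
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


# "Ingest": precompute document embeddings (a vector database would store these).
INDEX = [(doc, embed(doc)) for doc in DOCUMENTS]


def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents whose embeddings are most similar to the query embedding."""
    q = embed(query)
    scored = sorted(INDEX, key=lambda pair: -sum(a * b for a, b in zip(q, pair[1])))
    return [doc for doc, _ in scored[:k]]


def build_prompt(query: str) -> str:
    """Augment the user's question with retrieved context before sending it to the model."""
    context = "\n".join(retrieve(query))
    return (
        "Use only the context below to answer.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )


# The augmented prompt is what would be sent to the foundation model.
print(build_prompt("Do you have bright-colored golf shoes in size 10?"))
```

The fine-tuning path bakes knowledge into the model weights; the RAG path keeps the knowledge outside the model and injects it at query time, which is why it suits frequently changing data.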
Now that we have talked through different ways to customize your model and remove the heavy lifting with agents, let me walk through a hypothetical scenario on how to leverage GenAI capabilities for a task that many of us are familiar with: DIY. How many of you have a home improvement project on your to-do list? I see a few hands. My wife and I did a lot of work in our basement this summer, and believe me, getting that work done was a full-time job. Any new DIY project requires multiple complex steps. Oftentimes, one of the hardest things about the project is just figuring out how to get started. To help with these challenges, we have built a hypothetical DIY business, called Rad DIY, which is powered by a GenAI assistant built with Claude 2 on Amazon Bedrock. This assistant is designed to remove the complexities of a DIY project and provide customers with accurate and easy-to-follow steps. Let's see how it works. Nina is an ambitious DIYer, who wants to replace her bathroom vanity and decides to use the app for her project. She can use natural language to ask the assistant about any type of project and receive a list of detailed steps, materials, and tools, along with any necessary permits. The app also leverages customer inputs to generate images of their project using the Titan Image Generator model. So, after a short interaction, the app provides Nina with a few images for inspiration that she can further refine through conversation and feedback. Once Nina selects the design she likes, our app uses Multimodal Embeddings to search its extensive inventory and retrieve all of the products she will need. No multiple trips to the store will be necessary. In addition, the app uses the Cohere Command model to provide a summary of user reviews for each reviewed product. This summarization helps Nina decide if the products and the tools meet her requirements and skill level. Finally, if Nina wants to find a specific item for her vanity, like nautical bronze drawer handles, our app uses the Knowledge Bases feature in Bedrock to search inventory for products that meet her budget, skill level, and timeframe. Now that Nina has everything she needs, all that is left for her to do is start her project. I hope this hypothetical scenario sparks some ideas for you on how you can use Bedrock to build GenAI applications with your data to create new customer experiences. We make it easy to get started on Bedrock, but some of our customers also want hands-on support to get started with GenAI. That's why we offer the AWS GenAI Innovation Center, a program that pairs your team with our own expert AI/ML scientists and strategy experts to accelerate your GenAI journey. Since we announced it, this program has been gaining incredible momentum. Many of our customers also told us they want dedicated support to customize these foundational models for their needs, which is why we are introducing even more offerings through our Innovation Center. Today, I'm excited to announce a new Innovation Center Custom Model Program for Anthropic Claude. [applause] This program, available early next year, will be incredibly powerful, because it will enable you to work with our team of experts to customize these highly powerful Claude models for your business needs with your data. This includes everything from scoping requirements to defining evaluation criteria to working with your proprietary data for fine tuning.
You can then securely access and deploy your private models on Bedrock, which will be available only in your VPC. However, customizing these foundation models isn't the only way to build innovative AI applications. As I mentioned earlier, there may still be a need for certain companies to build their own, and these customers need powerful machine learning infrastructure. For instance, AWS has partnered with NVIDIA for 13 years to deliver large scale, high performance GPU solutions that are widely used for deep learning workloads, and this week, we announced an expansion of a strategic collaboration to deliver next generation infrastructure, software, and services for generative AI. And to provide more choice for our customers, we have invested in our own ML chips, AWS Trainium and AWS Inferentia, to push the boundaries on cost efficiency and performance. We also enable our customers with best-in-class software tools in the software layer of the stack with Amazon SageMaker. SageMaker makes it easy for customers to build, train, and deploy ML models, including these LLMs, with tools and workflows for the entire ML lifecycle, right from data preparation to model deployment. We have also invested in providing efficient model training with distributed training libraries and built-in tools to improve model performance, and today, leading organizations like Stability.AI, AI21 Labs, Hugging Face, and TII are training their foundational models on Amazon SageMaker. But with all of these investments in this area, training a foundation model can still be incredibly challenging. First, customers need to acquire large amounts of data, create and maintain a large cluster of accelerators, write code to distribute model training across a cluster, frequently inspect and optimize the model, and manually remediate any hardware issues. And all of these steps require deep ML expertise. Let me dive into some of these challenges to understand why it is so complex. Now, because of the massive size of these foundation models and the datasets used for training, developers need to split that data into chunks and load them into the individual chips in a training cluster, a distributed cluster with hundreds or even thousands of accelerators. This is a lot of work, because in order to make efficient use of these compute and network resources, the distribution needs to be tailored to the characteristics of the data, your model architecture, as well as the underlying hardware configurations. That means you have to write a lot of code and optimize it frequently. In addition, customers need to frequently pause and inspect the model performance, optimize the code if something is not working right. To do this, they have to manually take checkpoints of the model state, so that the training is able to start without any loss in progress. Finally, when any of these thousands of accelerators in the cluster fails, the entire training process is halted. To resolve this issue, customers had to identify, isolate, repair, and recover the faulty instance or change the configuration of the entire cluster, further delaying the progress. We wanted to make it easier for our customers to train these LLMs without interruption or delays. That's why, today, I am thrilled to announce the general availability of SageMaker HyperPod. [applause] This one is a big deal, because it's a new distributed training capability that can reduce model training time by up to 40%. HyperPod is pre-configured with SageMaker's distributed training libraries. 
This enables your data and models to be efficiently distributed across thousands of chips in the cluster and processed in parallel. HyperPod helps customers iteratively pause, inspect, and optimize these models, because it automatically takes checkpoints frequently, and if a hardware failure occurs, it detects the failure, replaces the faulty instance, and resumes the training from the last saved checkpoint. With this new capability, customers will see dramatic improvements, training models for weeks, if not months, without any disruption. But this is just one of the many innovations we announced for SageMaker this week. Today, we are announcing a slew of new SageMaker features across inference, training, and ML ops. [applause] I would have to extend by an hour if I had to cover all of it, so I'll just do a quick hit here. The new SageMaker Inference capabilities reduce model deployment costs by 50% on average and achieve 20% better latency. We also introduced new capabilities in SageMaker Studio, like a new user experience. And all of these updates make it even easier for customers to build, train, and deploy these new large language models. I encourage you to check out Bratin Saha's Innovation session later today to learn more about these innovations, and now, I'd like to introduce one of the customers who is leveraging some of these latest SageMaker innovations and training and deploying their own models. Please join me in welcoming Aravind Srinivas, CEO and Co-Founder of Perplexity, to the stage. [music playing] At Perplexity, we strive to be the world's leading conversational answer engine that directly answers your questions with references provided to you in the form of citations. Our company is re-imagining the future of search by trying to take us from ten blue links to personalized answers that cut through the noise and get to exactly what you want. Perplexity's Copilot is an interactive search companion. As you see, it starts with a general question that you had in your mind, digs deeper to clarify your needs, and after a few interactions with you, gives you a great answer. Ours is the first global publicly deployed example of generative user interfaces that reduces the need for prompt engineering. This is such a complex product to run and a hard problem to solve. Hence why we decided to go all in on AWS. We started off by testing frontier models, like Anthropic's Claude 2 on AWS Bedrock. Bedrock provides cutting-edge inference for these frontier models. This helped us to quickly test and deploy Claude 2 to improve our general question answering capabilities by providing more natural-sounding answers. Claude 2 has also helped inject new capabilities into Perplexity's product, like the ability to upload multiple large files and ask questions about their contents, helping us to be the leading research assistant there is in the market, but Perplexity is not just a wrapper on top of closed proprietary large language model APIs. Instead, we orchestrate several different models in one single product, including those that we've trained ourselves. We built on top of open-source models like Llama 2 and Mistral and fine-tuned them to be accurate and live, with no knowledge cutoff, by grounding them with web search data using cutting-edge RAG. This is when we started working with the AWS Startups team on an Amazon SageMaker HyperPod POC. SageMaker HyperPod makes it easier to debug large model training and handle distributed capacity efficiently. We obtained Amazon EC2 p4de capacity for training.
This enabled us to fine tune state-of-the-art open-source models, like Llama 2 and Mistral, and once we moved to HyperPod and enabled AWS Elastic Fabric Adapter, we observed a significant increase in the training throughput, by a factor of 2X, but it's not just training that we've benefitted from AWS. AWS has also helped us with customized service to support our inferencing needs, especially on p4d and p5 instances, and this helped us to build top of the market APIs for our open-source models and our in-house models that have been fine-tuned for helpfulness and accuracy. So, today, we are excited to announce the general availability of all these models in the form of APIs, including the first of its kind live LLM APIs, that have no knowledge cutoff and are plugged into our search index, all fully hosted on AWS. [applause/cheering] Thank you. [applause] Generative AI is still in its nascent stages, and we still think we are the beginning of what's going to be a glorious revolution for all of us, where the biggest winners are going to be you all, the consumers of the technology, where you get plenty of choices, great new product experiences, and competitive pricing. Perplexity is closing the research to decision to action loop even further, and we plan to get all our users to a point where you all take this for granted in the years to come. This is disruption and innovation at its prime. Perplexity strives to be the earth's most knowledge-centric company, and we are glad here to be working with AWS, so that no one here ever needs to go back to the ten blue link search engine. Thank you. [applause] [music playing] Wow! Thanks to Aravind for sharing how Perplexity is re-imagining search with new model innovations on AWS. As you saw from Perplexity and other examples so far, it is critical that you are able to store, organize, and access high-quality data to fuel your GenAI apps, whether you're customizing your foundation model or building your own. To get high quality data for GenAI, you will need a strong data foundation, but developing a strong data strategy is not new. In fact, many of you already have made strategic investments in this area, from databases that deliver data to your applications to BI tools that support fast data-driven decision making. GenAI makes this data foundation even more critical. So, what should your data foundation include, and how does it evolve to meet the needs of generative AI? Across all types of use cases, we have found that a strong data foundation includes a comprehensive, integrated set of services, as well as tools to govern your data across the end-to-end data workflow. First, you will need access to a comprehensive set of services that account for the scale, volume, and the type of use cases that you deal with in data. This is where AWS offers a broad set of tools that enable you to store, organize, and access various types of data via the broadest selection of database services, including relational databases, like Amazon Aurora and Amazon RDS. We also offer eight non-relational databases, including Amazon DynamoDB, and places to store and query data for analytics including AI and ML on top of all the S3-based data lakes, and Amazon Redshift, our data warehouse that provides up to six times better price performance than any other cloud data warehouse. You also need tools to act on your data. 
We have already discussed tools for ML and GenAI, but you also need services to deliver insights from your data, like Amazon QuickSight, our unified BI service, and you need to catalog and govern your data with services that help you centralize access controls. Across all of these areas, AWS provides you with the right tool for the job, so you don't have to compromise on performance, cost, or results, and we have carried this philosophy to your GenAI needs as well, including the tools you use for storing, retrieving, indexing, and searching these vector embeddings. As our customers use vectors for GenAI applications, they told us they want to use them in their existing databases, so that they can eliminate the learning curve associated with picking up a new programming paradigm: new tools, APIs, and SDKs. They also feel more confident because they already know how their existing databases work, how they scale, and what their availability looks like, and those databases can also evolve to meet the needs of vector workloads, and more importantly, when your vectors and business data are stored in the same place, your applications will run faster, and there is no data sync or data movement to worry about. For all of these reasons, we have heavily invested in adding vector capabilities to some of our most popular data sources, including Amazon Aurora, Amazon RDS, and OpenSearch Service. And earlier this year, we announced vector engine support for OpenSearch Serverless, and since we announced it in preview, this has been rapidly gaining in popularity with our customers. They're loving this truly serverless option, because it removes the need to manage servers for ingestion of your data and querying of your data, and today, I'm pleased to announce our Vector Engine for OpenSearch Serverless is generally available. [applause] Now, we are just getting started there. Not only have we added vector support for these services, but we have also invested in accelerating the performance of existing ones. For example, Aurora Optimized Reads can now support billions of vectors with a 20X improvement in queries-per-second performance with single-digit millisecond latency, and we are continuing to invest in ongoing performance improvements in these areas. Now, there are a bunch of customers who are storing their data in document databases or in key-value stores, like DynamoDB. That's why today, I'm pleased to announce vector capabilities in two of our most popular databases: DocumentDB and DynamoDB. [applause] For use cases that need high schema flexibility or JSON data, DocumentDB customers can now store their source data and their vector data together in the same database, and DynamoDB customers can access vector capabilities through a zero-ETL integration with Amazon OpenSearch, but we didn't want to stop there. In 2021, we added one more purpose-built data store, Amazon MemoryDB for Redis, our Redis-compatible, durable, in-memory database service for ultrafast performance. Our customers asked for an in-memory vector database that provides millisecond response time, even at the highest recall and the highest throughput. This is really difficult to accomplish, because there is an inherent tradeoff between speed, relevance of query results, and throughput. So, today, I'm excited to announce that vector search is now available in preview for MemoryDB for Redis.
[applause] MemoryDB customers get ultrafast vector search with high throughput and concurrency, and they can store millions of vectors and provide single-digit millisecond response time, even with tens of thousands of queries per second, at greater than 99% recall. This kind of throughput and latency is really critical for use cases like fraud detection and real-time chatbots, where every second counts. For example, a bank needs to detect fraud in real time to mitigate losses, and customers want immediate responses from chatbots. And finally, we also know that many of our customers are leveraging graphs for storing, traversing, and analyzing interconnected data. For example, a financial company uses graph data to correlate historical account transactions for fraud detection. Since both graph analytics and vectors are all about uncovering the hidden relationships across our data, we thought to ourselves, what if we combined vector search with the ability to analyze massive amounts of graph data in just seconds? And today, we are doing just that. I'm very, very excited to announce the general availability of Neptune Analytics. [applause] This is an analytics database engine for Neptune that makes it easier and faster for data scientists and app developers to quickly analyze large amounts of graph data. Customers can perform graph analytics to find insights in graphs up to 80X faster by analyzing their existing Neptune graph data or their data lakes on S3. Neptune Analytics makes it easier for you to discover relationships in your graph with vector search by storing your graph and vector data together. In addition to using this relationship information directly, you can also use it to augment your foundation model prompts through RAG. Snap, an instant messaging app with more than 750 million monthly active users, is using Neptune Analytics to perform graph analytics on billions of connections in just seconds to enable friend recommendations in near-real-time. We are thrilled to add all these vector capabilities across our portfolio to give our customers even more flexibility as they build their GenAI applications. We expect our innovation velocity in this area to rapidly accelerate, as new use cases emerge and flourish. Now, let's look at the second pillar of a strong data foundation. You will want to make sure that your data is integrated across data silos, so that you get a complete view of your business. You will want to ensure your data is readily accessible for your GenAI apps. When you break down the data silos across your databases, data lakes, data warehouses, and third-party data sources, you will be able to create better experiences for your customers. We know that building and managing these ETL data pipelines has been a traditional pain point for our customers, and one of the ways we are helping our customers create a more integrated data foundation is our ongoing commitment to a zero-ETL future. Since we announced this vision last year, we have heavily invested in building seamless integrations across our data stores, like our fully managed zero-ETL integration between Aurora MySQL and Redshift, which we announced earlier this year. This integration makes it easy to take advantage of near-real-time analytics, even when millions of transactions happen in a minute in Aurora, which you can then analyze in Redshift, and yesterday, we announced even more zero-ETL integrations to Redshift, including Aurora PostgreSQL, RDS for MySQL, and DynamoDB.
We also announced a new zero-ETL integration with DynamoDB and OpenSearch Service, enabling you to search and query large amounts of operational data in near-real-time. Today, tens of thousands of customers use Amazon OpenSearch to power real-time search, monitoring, and analysis of business and operational data, but if you look at it, most customers do not store all of their data in OpenSearch and often prefer to pull in data from a variety of logs in data sources like S3, which provides a low-cost and flexible option for storing their data. As a result, they have to create ETL pipelines to OpenSearch to query that data and get actionable results as fast as possible. For example, your operations team might want to query the last 90 minutes of data to troubleshoot an ongoing performance issue in your application. However, if your ETL jobs run nightly, the data your teams need right now won't be available. This slows down your ability to act and creates a risk that a smaller issue might end up becoming a really big one. That's why, today, I am delighted to announce a new zero-ETL integration between Amazon OpenSearch and S3. [applause] This one is a big one for observability, because it enables you to seamlessly search, analyze, and visualize all your log data in a single place without creating any ETL pipelines. To get started, you just use the OpenSearch console to set up a data connection and run your queries. It's really that simple. With new indexing capabilities and materialized views, you can accelerate your queries and dashboarding capabilities. Teams can use a single dashboard to investigate observability and security incidents, eliminating the overhead from managing multiple tools. You can also perform complex queries for forensic analysis and correlate data across multiple sources, helping you protect against service downtime and security events. With all of these zero-ETL integrations, we are making it even easier for you to get relevant data for your applications, including GenAI applications. Finally, your data foundation needs to be secure and governed to ensure that the data you use throughout the development of your application stays high quality and compliant. To help with this, we announced Amazon DataZone last year. DataZone is a data management service that helps you catalog, discover, share, and govern your data across your organization. It makes it really easy for employees to discover and collaborate on data to drive insights for your business. DataZone is used by companies like Guardant Health to help their developers focus more on their mission of building cancer solutions instead of worrying about building these governance platforms. Now, while DataZone helps you share data in a governed way within your organization, many customers want to securely share that data with their select partners. For that, we have AWS Clean Rooms, which makes it easier for customers to analyze data with their business partners without having to share their whole dataset. This capability enables companies to safely analyze collective data and generate insights that couldn't be produced on their own, but customers told us that they want to do more than just run analytics. They want to be able to run machine learning to get predictive insights in their Clean Rooms. One of the ways they want to accomplish this is through what is known as lookalike models, which take a small sample of customer records to generate an expanded set of similar records with a partner, all the while protecting your data.
For example, imagine an airline that can take signals about its loyal customers, collaborate with an online booking service, and offer promotions to new but very similar users. So far, this process has been extremely difficult to accomplish without one party sharing data with the other. That's why, today, I'm introducing the preview of AWS Clean Rooms ML. [applause] This is a first-of-its-kind capability to apply ML models with your partners without sharing the underlying data. With Clean Rooms ML, you can train a private lookalike model across your collective data, keep control of your models, and delete them when you're done. These models can be easily applied in just a few steps, saving you months of development time and resources. Clean Rooms ML also offers intuitive controls to tune these model outputs based on your business needs. Lookalike modeling is available now, but this is just the first of many models we will be introducing soon. We also plan to introduce modeling for healthcare in the coming months. Now, I know that our next speaker has a lot of experience using their data to build new innovations for our customers, including applications that are powered by GenAI. Please join me in welcoming Rob Francis, SVP and CTO for Booking.com. [music playing] Thank you, Swami. It's great to be here with all of you. When I first started thinking about what I would want to share with this group today, I was reflecting a little bit on the arc of technology over my own career, and I was thinking about my first attempts to have a kind of human-like interaction with technology. It reminded me of Eliza. If anybody remembers this one, this was developed by Joseph Weizenbaum in the mid-60s. For the Emacs users in the group, I thought to myself, why not just fire up ESC-x doctor and ask Eliza what I should talk about today. So, as you can see, it's fairly frustrating. For those of you who are not familiar with Eliza, it basically just put the question back on you, and you really got nowhere at all, but I'm very excited about what's possible today for our customers at booking.com with the emergence of generative AI, but first, let me tell you a little something about booking.com. If you're familiar with us, you probably think of us from a travel perspective, and it's true. We have accommodations, flights, rental cars, attractions all over the globe, but we're a two-sided marketplace, and we have partners all over the globe that help make our connected trip possible, but let me give you a sense of some of our scale. In the accommodation space alone, we have over 28 million listings of places to stay. If you want to book a flight, you can choose a flight in 54 countries. You might need a rental car in that location. You can choose from over 52,000 rental car locations across the globe, and if you'd like to find something to do, you can book an attraction before or in the app. As you can imagine, that presents a lot of data challenges for us in the form of we manage over… it's over 150 petabytes of data, and several years ago, we recognized that we needed some help. So, we partnered with Amazon and AWS to help us tackle some of these challenges, and I'm happy to say, it worked. Our data scientists are thrilled to see that the number of jobs they can train concurrently has gone up by 3X.
They have seen a 2X decrease in the number of jobs that fail, which were largely due to limitations of our own infrastructure, but certainly their favorite is the 5X reduction in the time it takes to train their jobs, and I should note that some of these jobs are trained on over three billion examples. But we didn't partner with AWS just for our data strategy. We wanted a partner who was going to help us with the emergence of new technology. So, when generative AI really hit this year, we set out to build the AI trip planner, making it much easier to book a trip with Booking.com in a conversational manner. Let me show you how it works. First, I'd like to point out that we are big fans of open source at Booking.com, and we felt that the Llama 2 model was the perfect one for us to implement our intent detection model. So, we started there, hosting Llama 2 as a SageMaker endpoint. Now, if you notice on the left-hand side, I entered into our AI trip planner what I would just say in a conversation: I'm going to a conference in Las Vegas. I don't really like to gamble, but I do like good food. Where should I stay? Well, the first thing you have to realize is that we have to do a little bit of moderation first. First, we have to make sure that the conversation is even related to travel. That isn't always the case, but more importantly, one of the things that we learned is that our customers tend to put their personally identifiable information in the trip planner itself, and we always want to protect our customers' privacy. So, we want to make sure that we're stripping those sorts of things out first, but there's more to it than that. One of the things that I tend to talk about when I talk about Booking.com with our customers is our great selection, our flexibility, and great prices, but they always tell us that they love our reviews. There are just years of data there that really help them make a good decision. So, our RAG implementation, leveraging AWS technology, pulls in our review data and helps make it easier for our travelers to make a decision. Lastly, we ask the LLM to populate a JSON object that speaks directly to our recommendation engine, also powered by machine learning, which gives them the best choices possible in a nice little carousel that they can swipe through and book right in the app, and we're really just getting started. As you can see, we're already making use of SageMaker. We're working closely with the Bedrock teams on a couple of exciting new things coming up, and we're also working with the Titan teams, but I thought to myself, for today, why not cover the whole arc, and maybe I should ask Titan what I should have said today. So, let's see what Titan had to say. Good morning, everyone. Thank you, Swami, for the introduction. One, travel booking is a large market with lots of data, but until now, it has been very difficult to use that data to personalize the booking experience. Very true. Two, generative AI is a new technology that can take all of that booking data, learn from it, and then generate new content that is personalized to each individual user. You saw that. Three, at Booking.com, we use GenAI to create personalized hotel recommendations that are tailored to each user's unique needs and preferences. Want to book a trip? Visit Booking.com. Wow, what a difference! How about that for an arc of technology? Thank you very much. [music playing] Thanks, Rob. This is a great example of how you can leverage data to build GenAI apps that provide a truly customized user experience.
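As a rough illustration of the intent-detection step Rob describes, here is a minimal Python sketch of calling a Llama 2 model hosted as a SageMaker endpoint. The endpoint name, the prompt, and the JSON request/response shape are assumptions; the actual payload depends on the serving container Booking.com uses.

import json
import boto3

runtime = boto3.client("sagemaker-runtime")

def detect_intent(user_message):
    # Simple classification prompt; labels are hypothetical.
    prompt = (
        "Classify the travel intent of the message as one of: "
        "accommodation, flight, car_rental, attraction, other.\n"
        f"Message: {user_message}\nIntent:"
    )
    response = runtime.invoke_endpoint(
        EndpointName="llama2-intent-endpoint",   # hypothetical endpoint name
        ContentType="application/json",
        Body=json.dumps({"inputs": prompt, "parameters": {"max_new_tokens": 8}}),
    )
    # Response shape depends on the container serving the model.
    return json.loads(response["Body"].read())

print(detect_intent("I'm going to a conference in Las Vegas. Where should I stay?"))

From the returned intent, an application like the AI trip planner could decide whether to continue the travel conversation, strip PII, or hand off to the RAG and recommendation steps.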
So, it's clear that data is the fuel for GenAI, but for this to be a true symbiosis, GenAI must also benefit our data foundation. While we typically think about ML and AI as an outcome of data, we can also use it to transform the way we manage data. This means AI can actually enhance the data foundation that fuels it. AWS has infused AI and ML across many of our data services to remove the heavy lifting from data management and analytics. However, with all the strides we have made, managing and getting value out of data can still be challenging. Some of our customers are even asking if they can leapfrog their data strategy with GenAI. The truth is that while GenAI still needs a strong data foundation, we can also use this technology to address some of the big challenges in data management, like making data easier to use, making it more intuitive to work with, and making it more accessible. One area where we can apply AI is optimizing the performance of the places where you store and query your data, like your data warehouses. Data warehouse administrators need to manage multiple dimensions, like data variability, the number of concurrent users, and query complexity, all the while having to balance price and performance. These multivariate optimizations are extremely hard for humans, but machine learning algorithms are really incredible, and they excel at them. That's why, earlier this week, we launched AI-driven scaling and optimizations for Redshift Serverless, enabling you to proactively scale on multiple dimensions at the same time, all the while managing price and performance. The end result is that queries just run faster with the optimal price/performance tradeoff, and customers are seeing big benefits. These AI-driven scaling and optimizations will help Honda deliver better price performance and get actionable insights from the millions of vehicle data points that are loaded into their data warehouse, without manual intervention. In addition to optimizing the data warehouse, we know there are other ways we can support data management with AI, like with Amazon Q. As I mentioned earlier, Amazon Q is a new type of GenAI-powered assistant that is tailored to your business. Q supports virtually every area of your business by connecting to your data for context on your role, internal processes, and governance policies. You can ask Q in natural language to receive actionable information that removes the heavy lifting from many repetitive tasks. For instance, when we ran an internal poll within Amazon asking developers how they spend their time, they told us a large portion of the day is focused on things like looking up documentation, building and testing new features, maintenance and upgrades, and troubleshooting. We know Q can make these tasks easier, which is why it's now in all the places you build software and work with AWS. And because Q knows you and your business, it also helps you manage your data and takes the heavy lifting out of common data-related tasks, like running data queries. We know that GenAI is very good at translating natural language to code. So, we thought to ourselves, why not leverage Amazon Q to support the language our customers use to query data warehouses every day: SQL? To run a data query, you typically have to write SQL code to load and analyze petabytes of structured and semi-structured data.
This is where we offer tools like the Redshift Query Editor for builders who write SQL, including error detection during table creation, autocomplete suggestions, and syntax validation. But querying workloads can still present challenges. What seems like a really simple request, such as identifying which venues sold the most tickets in a given timeframe or location, actually involves a complex SQL query that takes a lot of trial and error. To remove the heavy lifting from data querying, I'm excited to announce the preview of Amazon Q in Redshift. [applause] If you need help creating custom SQL, Q can turn your natural language prompts into customized recommendations. It's available natively as part of the Redshift Query Editor. With Q, you can ask in plain English something like: which three venues sold the most tickets? The underlying model then analyzes the schema and produces a SQL query recommendation in just seconds. We can then add it to our notebook and run the query to test it out. Here, it knows the information on ticket sales can be found in the sales table, but it also knows to search the event table to find the venue. We can then ask Q to find out which event types were the most popular at those venues. However, we can see that these results are based on the total number of tickets, which is not what we want here. So, we can quickly use Q to course correct and ask it to retrieve the data based on the total number of events, and for even more accurate and relevant recommendations, you can enable query history access for specific users without compromising your data privacy. In addition to data querying, you can also solve some of the more painful data management jobs, like building your data pipelines. While our zero-ETL integrations can eliminate data pipelines between many of your data sources, we recognize that many of our customers still have to write custom ETL jobs that they need to create and maintain constantly. Q can simplify this process for our customers too. Today, I'm pleased to announce that, coming soon, you will be able to use Q to create data integration pipelines using natural language. [applause] With this new integration, you can build data integration jobs faster. You can troubleshoot them with a simple chat interface and get instant data integration help. Now, let's take a quick example. You can ask Q something like: read this data from S3, drop the null records, and write the results to Redshift, and it will return an end-to-end data integration job to perform this action. Think about how powerful this is. Under the hood, this integration uses Agents for Amazon Bedrock to break down the prompt into a specific set of tasks and then combine the results to build these integration jobs. Q delivers an intuitive interface for integrating your data without you needing any prior knowledge of AWS Glue. Now that we have covered all the elements of a strong data foundation, let's see how they can come together to spur net new innovation. For this, please welcome Shannon Kalisky, Senior Product Manager at AWS, to the stage. [music playing] Thank you, Swami. As humans, we are wonderfully curious creatures, and that desire to learn and share has given rise to more data than we've ever had before, and every 'aha' moment, every experience becomes a story. So, how ironic is it that two of the toughest challenges we face in our day-to-day jobs are just getting the data we need where and how we need it, and then using that data to tell a story? It should be simple.
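To ground the Amazon Q in Redshift example above, here is a sketch of the kind of SQL that prompt ("which three venues sold the most tickets?") could translate to, submitted through the Redshift Data API in Python. The workgroup and database names are placeholders, the tables follow Redshift's sample TICKIT schema, and the actual SQL Q generates may differ.

import boto3

redshift_data = boto3.client("redshift-data")

sql = """
SELECT v.venuename, SUM(s.qtysold) AS tickets_sold
FROM sales s
JOIN event e ON s.eventid = e.eventid
JOIN venue v ON e.venueid = v.venueid
GROUP BY v.venuename
ORDER BY tickets_sold DESC
LIMIT 3;
"""

response = redshift_data.execute_statement(
    WorkgroupName="my-serverless-workgroup",  # placeholder Redshift Serverless workgroup
    Database="sample_data_dev",               # placeholder database containing the TICKIT tables
    Sql=sql,
)
print(response["Id"])  # statement ID; results are fetched later with get_statement_result

The two joins are exactly the trial-and-error part the keynote describes: ticket quantities live in the sales table, while the venue name only appears after hopping through the event table.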
Most of us took flights to be here today, and some of you may have encountered a delay or a cancellation along the way, and when that happens, ugh, it is miserable. So, what if we could change that? Let's imagine we want to create a new feature, something that knows who you are and the situation that you're in and can allow you to easily rebook a flight without stress, but first, we have to wrangle the data, and there is a lot of it in a lot of formats, and it is all over the place. In S3, we have data on baggage information, where every time the status of a bag changes, a new file is created. In Redshift, we have data on passenger details, flight schedules, and aircraft availability. Then in Aurora, we have information on payment methods and customer preferences, and then we have real-time data, like weather, coming in as Amazon Kinesis streams. To bring all of that data together with a traditional ETL pipeline could take weeks, if not months. So, instead, we'll use the zero-ETL integrations across AWS to bring all of that data together without building a pipeline and without writing code. The first thing we'll do is create a Redshift Serverless data warehouse, and this will give all of our data a place to land. Through zero-ETL, our data on S3 is automatically replicated into Redshift. So, the next time we get one of those baggage updates, we will see it in Redshift instantly. Now, some of our data was already in Redshift, and for that, we can use data sharing to share it across warehouses. There's also a zero-ETL integration between Aurora and Redshift, and that will allow us to get the data from Aurora into Redshift the minute it is written, and finally, we can use Redshift Streaming Ingestion to bring in the real-time data streams from Kinesis, and with that, we have all the data we need, and our new feature can take flight. Now, we need to measure success, and to do that, we will use Amazon Q in QuickSight, where key metrics and critical data come together. Using Q, I can create an executive summary of this dashboard, which will show me the most important insights. For example, I see that, since we created our new feature, the time to rebook a flight has decreased dramatically. It's pretty amazing, and I want leadership to see the impact that feature is having. So, I'll use Q to create a data story. All I do is select the format and then tell Q what I want to cover, select the visuals, and then build, and in seconds, I have a beautiful story based on my actual business data. It covers the problem and the impact to customers, just like I asked, and the best part is that it is completely customizable. For example, I can take this paragraph on recommendations, and I can use Q to transform it into bullets, so that those key takeaways are more obvious, and when I'm ready, I can securely share this story with others throughout my organization, so that we can all use it to drive towards better decision making, and just like that, we've taken down two of the toughest challenges we face in our day-to-day jobs. I hope that you all stay curious, and that these tools help you the next time you are knee-deep in data or struggling with writer's block. Thank you. [applause/music playing] Thank you, Shannon. With a strong foundation, you can quickly connect to your data, make strategic decisions, and build new experiences that improve customer loyalty and satisfaction. So, we have covered how data supports GenAI, and how GenAI supports data, but how do we as humans fit into all of this?
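For the streaming piece of Shannon's demo, here is a hedged sketch of how Redshift Streaming Ingestion from a Kinesis stream can be wired up with two SQL statements submitted through the Redshift Data API. The IAM role ARN, stream name, workgroup, and database are placeholders, and the exact statements used in the demo may differ.

import boto3

redshift_data = boto3.client("redshift-data")

statements = [
    # Map a Redshift schema onto the Kinesis data stream.
    """
    CREATE EXTERNAL SCHEMA kinesis_weather
    FROM KINESIS
    IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftStreamingRole';
    """,
    # Materialized view that lands each stream record as parsed JSON.
    """
    CREATE MATERIALIZED VIEW weather_updates AUTO REFRESH YES AS
    SELECT approximate_arrival_timestamp,
           JSON_PARSE(FROM_VARBYTE(kinesis_data, 'utf-8')) AS payload
    FROM kinesis_weather."weather-events";
    """,
]

response = redshift_data.batch_execute_statement(
    WorkgroupName="my-serverless-workgroup",  # placeholder
    Database="dev",                           # placeholder
    Sqls=statements,
)
print(response["Id"])

Once the materialized view exists, queries against weather_updates see new stream records as the view refreshes, which is how the real-time weather data joins the rest of the warehouse without a separate pipeline.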
Recently, I had the honor of presenting with the CEO of Hurone AI at the UN General Assembly event in New York, where we shared how GenAI can make an enormous impact in the world. Dr. Kingsley built this company with one mission-critical belief: that where you live should not determine if you live or die of cancer, whether you are in Latin America or in Africa. Across Sub-Saharan Africa, there is roughly one oncologist for every 3,200 cancer patients. By comparison, the U.S. ratio is about one oncologist for every 300 patients, and in particular, the country of Rwanda, with a population of about 13 and a half million people, has fewer than 15 oncologists. That means Rwandan patients make arduous commutes to medical facilities and often wait until symptoms are dangerously severe before even reporting them. To combat this problem, Hurone AI created applications that help make the best possible cancer care accessible to everyone regardless of their location, and with innovations they are building on Bedrock, Hurone AI is revolutionizing cancer care detection and diagnosis predictions to improve treatment access for patients in Kenya, Nigeria, and Rwanda who need it the most. They're also advancing equitable biopharma research and development by filling a critical cancer data gap for underrepresented populations. Now, I'd like to pause and take a brief moment to honor Dr. Kingsley, as he's here today with us. [applause/cheering] Amazing. Thank you, Dr. Kingsley, for all of the amazing work you do. I love this story, because it showcases how GenAI can augment our human abilities to solve many, many critical problems. GenAI will undoubtedly accelerate our productivity. That's why AWS is continuing to invest in tools that will completely reinvent the way you work, with new applications and services with GenAI capabilities built inside. This includes tools like Amazon CodeWhisperer, our AI-powered coding companion. CodeWhisperer is trained on billions of lines of code to help you build applications faster with code recommendations in real time, in your IDE. And earlier this year, we announced CodeWhisperer customizations, which use your internal code base to generate recommendations that are more relevant to your business. You can securely connect to your private repositories, and with just a few clicks, you're good to go. Your customizations are isolated to protect your valuable IP, and only the developers with the designated access will be able to use them. While customers are rapidly adopting CodeWhisperer to improve their productivity, it's not the only way we are helping you get more done with the power of GenAI. As I mentioned earlier, Q can help you remove the heavy lifting from common tasks, irrespective of what your job function is. This means Q can accelerate productivity, whether you are building applications, creating financial reports, presenting data to your exec team, or working in a call center. And just like how we are building Q to support our customers, our customers are also infusing GenAI capabilities into their own products to power assistive experiences for their customers. Now, let's hear from one such customer, Toyota, who is doing just that, with a short video. At Toyota, safety is embedded into everything we do. I often like to tell our engineers that we may not be doctors, nurses, or firefighters, but we have the opportunity to help save lives.
People don't tend to think of data and safety together, but ultimately, having the right data at the right time can help us determine if a vehicle has been in a collision, and it allows us to help get emergency responders to a customer's vehicle in the most efficient way possible. To achieve this, we pull data from hundreds of sensors in the vehicle, from millions of vehicles globally on our platform, which equates to petabytes of data. So, the challenge was: how do we process all of that data in real time when every second counts? We realized, wow, this is a really interesting engineering problem to solve, and one of the things that we love about AWS is there are so many options. It allows us to be so creative. For our cloud migration, we were able to seamlessly transport our data into AWS really easily by just flipping a switch. So, the moment that a vehicle is in a collision, there is a trigger event that goes from the communications module up to our AWS cloud, and then somebody from our call center will be talking to your vehicle within three seconds. At the end of the day, it's just super exciting to think about the future for Toyota vehicles. For instance, we are also using newer technologies like generative AI, and with a managed service like Amazon Bedrock, we were able to ingest the owner's manual and develop a generative AI powered assistant, and it's going to be able to tell you anything about your vehicle using some very simple voice commands, just saying, hey, Toyota, tell me about this icon. That icon means the traction control system has been enabled due to slippery road conditions. It's almost like saying your car is now going to be the expert on itself. Sometimes, when you get involved in the tech, you just get into the weeds of the code and fixing things, and you kind of forget the actual impact that you have on the world. There's a lot of pressure, but there's also a lot of pride in being able to wake up every day and know that that is what I'm going to be working on. [music playing] [applause] What a great story! I love seeing how builders are using the power of data and GenAI to solve real-world problems. While harnessing all of this technology is important to driving innovation, we also need to harness one of the most essential inputs, the power of human intellect, to create customer experiences that make a bigger impact, and when we look at how we can benefit from and strengthen the relationship between data and GenAI, we can think of ourselves as the facilitators who create a powerful cycle of reinforcement over time. Let me share a quick example to show you what I mean. This example comes from deep within the Panama Rainforest, where there are more than 1,500 species of trees. But there is one tree in particular, the Virola tree, that lies at the heart of our story. Growing to more than 130 feet tall, the Virola produces a small red fruit that is very popular with the local wildlife, including the toucan, as well as a small mammal called the agouti. As the fruit matures, toucans perch themselves in the treetops for a quick snack before dropping the leftover seeds to the forest floor, and since the agoutis can't climb, they welcome these little red seeds from the toucan. Then, to prepare for the dry season, when food is in short supply, they also collect as many seeds as possible and bury them in the soil for safekeeping. Many of these will later grow when conditions are ripe, sprouting new trees that continue the cycle for decades or even hundreds of years.
So, the toucan helps both the forest and the agouti, the tree supplies food for both animal species, and the agouti plants seeds in the soil, allowing new trees to grow. This story reminds me of the relationship between humans, data, and generative AI for a couple of reasons. For one, the relationship creates longevity: more trees, more food, and longer lives. And two, this cycle wouldn't exist without the collaboration and facilitation along the way, ultimately strengthening each element over time. As humans, we are responsible for creating and facilitating a flywheel of success with data and GenAI. That's because we provide a unique set of benefits that create more efficient, responsible, and differentiated GenAI apps. We created the innovation that generates data, making GenAI technology possible in the first place, as well as the data foundations that support it. We identify use cases and lead the development of GenAI apps that support our unique business needs, and at different points along the way, we provide valuable feedback to maximize the efficiency of these GenAI apps and the output that's generated. This is exactly what Ada Lovelace was talking about. One of the most common ways you can integrate human feedback into your GenAI strategy is the model evaluation and selection process. When you pick the best model for your use case, you can optimize accuracy and performance, while better aligning to your brand, style, and voice. However, model evaluation requires a deep level of expertise in data science, and it can be a tedious, time-consuming process. Customers first need to create the right benchmarking datasets, metrics, and algorithms to calculate those metrics. Next, they need to set up a human evaluation workflow for subjective criteria, like friendliness and brand style, which can be difficult to build and operate, and finally, they often need to build benchmarking tools to pick the most appropriate model. This entire process also needs to be repeated periodically as new models are released or existing models are fine-tuned. We wanted to make it easier for our customers to evaluate models for their specific needs. That's why, today, I'm very excited to announce the preview of Model Evaluation on Amazon Bedrock. [applause] This new capability allows you to quickly evaluate, compare, and select the best foundation model for your use case. With this feature, you can perform automatic and human-based evaluations depending on your needs. For automatic evaluations, you can leverage curated datasets or use your own to evaluate quantitative criteria like robustness, toxicity, and accuracy. To evaluate subjective criteria, like brand voice, Bedrock offers fully managed human review workflows, with support from AWS experts as well. And then we provide you comprehensive reports to easily review metrics on model performance. This capability is also now available in SageMaker Studio. Human input will continue to be a critical component of the development process with GenAI, and as our relationship with this technology evolves, so will the skillsets we need to unlock its potential. According to the World Economic Forum, nearly 75% of companies will adopt AI technologies by 2027. While some tasks will become obsolete, GenAI will also create entirely new roles and even new products and services, and employers will need to support them with the right people.
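To picture the comparison that Model Evaluation on Amazon Bedrock automates, here is a hand-rolled Python sketch; it is not that feature's API. It calls two Bedrock Titan Text models on a tiny prompt set and applies a toy keyword metric; the Titan request/response shape, the prompts, and the metric are assumptions made for illustration only.

import json
import boto3

bedrock = boto3.client("bedrock-runtime")

def generate(model_id, prompt):
    body = json.dumps({
        "inputText": prompt,
        "textGenerationConfig": {"maxTokenCount": 128, "temperature": 0.0},
    })
    resp = bedrock.invoke_model(
        modelId=model_id, body=body,
        contentType="application/json", accept="application/json",
    )
    return json.loads(resp["body"].read())["results"][0]["outputText"]

# Tiny benchmark: each prompt is paired with keywords a good answer should mention.
benchmark = [
    ("Summarize what a zero-ETL integration removes.", ["pipeline"]),
    ("What does a data warehouse store?", ["data"]),
]

for model_id in ["amazon.titan-text-express-v1", "amazon.titan-text-lite-v1"]:
    hits = 0
    for prompt, keywords in benchmark:
        answer = generate(model_id, prompt).lower()
        hits += all(k in answer for k in keywords)
    print(f"{model_id}: {hits}/{len(benchmark)} answers contained the expected keywords")

The managed feature replaces this manual loop with curated datasets, richer quantitative metrics, and human review workflows for the subjective criteria a keyword check cannot capture.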
Hard skills in ML and AI will continue to be important, but soft skills like creativity, ethics, and adaptability will grow increasingly critical with GenAI. Many are calling this monumental shift the reskilling revolution, and we are helping our customers prepare for it in a variety of ways. To support the workforce of tomorrow, we recently launched the AWS GenAI Scholarship with Udacity. This program provides more than $12 million in scholarship value to over 50,000 high school and university students. We have also been investing in our own AWS AI & ML Scholarship Program for years, which is making a profound impact on students globally. As someone who grew up with limited access to technology, I'm deeply committed to our investments in this area. Students and professionals who are joining this reskilling revolution are critical to the future of this industry. In addition to these opportunities, we offer more than 100 AI and ML courses and low-cost trainings. These tools will enable you to build new skills and get started with GenAI. We are also introducing new ways to experiment and have fun learning GenAI. That's why we recently announced PartyRock, an Amazon Bedrock playground. [applause] Some PartyRock fans out there. It's an easy and accessible way to learn about GenAI with a hands-on, code-free app builder. Since its release, users have created tens of thousands of PartyRock applications using powerful foundation models from leading AI companies. And starting today, we are also including the Titan Text Express and Lite models. Now, to get started with PartyRock, all you need is a social login. No AWS account required. So, let's see in action how easy it is to build a PartyRock app in a few steps. We start on the homepage by pressing the build your own app button and providing a brief description of what our app should do. Our new app contains a prompt, where we can experiment with different prompt engineering techniques and review the generated responses. We can also add additional LLM-based widgets, like a chatbot, to make our application more useful and fun to use, and we can even select different models to see what works best for our use case. We can then use our chatbot to discuss the results and engage in further conversation. And once we are happy with the results, we can publish our application to the world and invite other people to use it or remix it to make their own version. You can share your apps on social media or get featured on the new Discover page, which showcases community apps selected by the PartyRock team. You might even find my own personal favorite, the app my daughter created over Thanksgiving weekend that helps you create your own chocolate factory. I encourage you to check out PartyRock today. With all of this data and GenAI innovation happening around us, it is important to remember that each of you will continue to bring your own unique inputs and ideas to the table. As Albert Einstein once said, "Creativity is seeing what others see and thinking what no one else thought." Today, the powerful symbiotic relationship between data, GenAI, and humans is accelerating our ability to create new innovations and differentiated experiences, and we are just getting started.
From a secure place to customize your foundation models with your data, to GenAI-powered services that strengthen your data foundation, to tools with GenAI built in that augment employee productivity, to mechanisms for implementing human feedback, AWS has everything you need to unlock this transformative technology. And with tools like PartyRock to help you kickstart your GenAI journey, I look forward to seeing what you create next with AWS. Thank you. [applause/music playing]
Info
Channel: Amazon Web Services
Views: 147,720
Keywords: AWS, Amazon Web Services, Cloud, AWS Cloud, Cloud Computing, Amazon AWS
Id: 8clH7cbnIQw
Length: 111min 28sec (6688 seconds)
Published: Thu Nov 30 2023