Google Cloud Applied AI Summit

>> THOMAS KURIAN: Good morning. Today is a really exciting day for all of us. We have been working on AI for years, and our goal has been to create a platform that allows people to build applications that can understand, reason, and act on information in the same way that humans do. Human beings understand the world through an amazing array of information all at once. We see, hear, read, listen, and talk about many different types of information -- text, audio, code, images, and video -- simultaneously. People want to build with AI that can take this reality into account. These advances will allow developers and businesses to custom-build agents that can reason, understand, and do more on people's behalf. Last week we took a massive step forward towards this vision with Gemini, Google's most capable model. It's the world's first natively multimodal model, capable of making sense of text, audio, code, images, and video, all at once. Gemini has significant breakthroughs in advanced reasoning and understanding. Today we'll show you a glimpse of some of the advances that are made possible by Gemini. We will also talk about our new AI Hypercomputer, advances in our Vertex AI platform that help developers build amazing things with Gemini, and new capabilities in Duet AI to help developers and cybersecurity analysts be more productive. Let's watch a quick introduction of Gemini, Google's largest and most capable AI model.

>> Gemini is our largest model. It means that Gemini can understand the world around us in the way that we do -- so not just text, but also code, audio, image, and video.
>> We used Gemini to extract key information.
>> We wrote a prompt, and with its advanced reasoning capabilities, Gemini was able to distinguish between papers relevant to the study and those that weren't.
>> I'm delighted to introduce AlphaCode 2, powered by Gemini. When we evaluate it on the same platform as the original AlphaCode, it solves almost twice as many problems.
>> Gemini on its own has the ability to transform software development as we understand it.
>> Safety and responsibility has to be built in from the beginning.
>> And that has oriented us to be both bold and responsible together.
>> Developers and enterprise customers are going to figure out really creative ways to further refine our foundational models.
>> Gemini will be available in three sizes -- Gemini Ultra, our most capable and largest model for highly complex tasks; Gemini Pro, our best performing model for a broad range of tasks; and Gemini Nano, our most efficient model for on-device tasks.
>> It's been a monumental engineering task, which has been challenging but exciting.

>> THOMAS KURIAN: Here to tell us more, Eli Collins, Google DeepMind's vice president of product.

>> ELI COLLINS: Thanks, Thomas. Gemini represents a large-scale collaborative effort of people from across Google. Gemini is our largest and most capable AI model, and our most general. We've rigorously tested Gemini models on 32 widely used industry benchmarks, and on 30 of the 32, our largest and most capable model, Gemini Ultra, exceeds current state-of-the-art results. In fact, on one of the most popular benchmarks, MMLU, or massive multitask language understanding, Gemini is the first AI model to outperform human experts, with a score of 90%. And notably, on the new MMMU multimodal benchmark, Gemini achieves a state-of-the-art score of over 59%. We've optimized Gemini in three sizes,
which means it's able to run on everything from mobile devices to data centers. Gemini Ultra is our most capable and largest model for highly complex tasks. Gemini Pro is our best model for scaling across a wide range of tasks. And Gemini Nano is our most efficient model for on-device tasks. So those are its performance stats. Now let's get into a little more about what makes Gemini such an exciting advancement over existing models. Gemini is natively multimodal, it has sophisticated reasoning capabilities, and it can code at an advanced level. Let's jump in. Gemini is built as multimodal from the ground up. That means it's trained on different data types simultaneously, allowing it to understand and reason across them seamlessly. Traditionally, creating multimodal models involves training separate components for different modalities and stitching them together to roughly mimic some of this functionality. But because Gemini was pretrained from the start on text, images, audio, video, and code, it can seamlessly have a conversation across modalities and give you the best possible response. Gemini also possesses exceptional reasoning skills. It can analyze vast amounts of complex information and extract insights at digital speeds, making it valuable in science, finance, and really any industry that wrestles with that challenge. Another area that's very near and dear to us at Google is Gemini's advanced coding capabilities. Gemini is one of the leading foundation models for coding in the world. Gemini Ultra excels in several coding benchmarks, including HumanEval, an important industry-standard benchmark for coding tasks, and Natural2Code, our own, more reliable coding benchmark. In the future, programmers will make use of highly capable AI models as collaborative tools that can assist with the entire software development process, from reasoning about problems to assisting with implementation. I've been fortunate to experiment with Gemini for the past few months, and I can't wait to see how you all build transformative applications using it. I'll hand it back to Thomas to tell you more about how Gemini is coming to life in Google Cloud.

>> THOMAS KURIAN: Thank you, Eli. Last week we announced early access to Gemini Ultra for select customers. Starting today, Gemini Pro is available to all developers in Vertex AI. Early next year, we'll also make Gemini Ultra available in Vertex AI, and Gemini will be used in Duet AI, in both Google Workspace and Google Cloud. Gemini Pro was designed to be our most efficient model to serve, so we priced it to be accessible to all developers. With a focus on hardware and software optimization, we've reduced our pricing by four times per input character and two times per output character, while at the same time improving model quality and reducing latency. Now let's talk about the AI-optimized infrastructure on which Gemini was trained and is being served, how developers can access and build agents powered by Gemini, and new capabilities we are introducing with Duet AI to help developers and security analysts. Let me start by inviting Amin Vahdat to share more about our AI infrastructure.

>> AMIN VAHDAT: Thank you so much, Thomas. Let's take a quick look at some of our groundbreaking AI infrastructure innovations. From fully integrated AI-optimized stacks to purpose-built TPU AI supercomputers, advancements in AI infrastructure must continue to meet and exceed the computing demands of our customers.
In terms of demand, with the growth of Gen AI, the number of parameters in Large Language Models has increased by 10x a year for the past six years -- that's a total factor of one million. Hundreds of billions, if not trillions, of parameters bring new levels of sophistication and capability, but they also, of course, bring heightened requirements for training, tuning, and inference. We have seen rapid innovation through the years with the evolution of our TPU AI supercomputers. We trained Gemini 1.0 at scale using Google's in-house designed Tensor Processing Units, or TPUs, v4 and v5p. And we designed Gemini to be our most efficient model to serve. On TPUs, Gemini runs significantly faster than earlier, smaller, and less capable models. For more than a decade, these custom TPUs have been at the heart of Google's AI-powered products that serve billions of users, like Search, YouTube, Gmail, Google Maps, Google Play, and Android. They've also enabled companies around the world to train large-scale AI models of their own, cost effectively. Last week we were thrilled to announce our most powerful and scalable TPU to date, Cloud TPU v5p. The P in v5p stands for performance. What's pictured here is just seven racks out of the 140 that make up a full TPU v5p supercomputer. It features more than twice the FLOPS per chip, and four times the total performance, relative to the previous-generation TPU v4. V5p allows our team to train models with even lower latency, which accelerates serving future versions of Gemini. It also opens up unprecedented capability for our customers to take on the most demanding workloads. These TPUs are designed with judicious specialization coupled with application co-design: a synchronous, high-bandwidth interconnect supporting communication at 4.8 terabits per second per chip; customized liquid cooling for maximum system efficiency and performance per watt; Optical Circuit Switching for fault tolerance and the most efficient scheduling; specialized data representation specific to ML workloads; high-bandwidth memory with 10 times the speed; and specialized tensor cores for the most advanced mathematical operations. These advancements, as well as many others, enable our TPU AI supercomputers, and most recently our TPU v5p, to power the next generation of multimodal foundational models. But everyone knows a chip is only as powerful as the full stack that surrounds it. That's why last week we announced AI Hypercomputer, a groundbreaking supercomputing architecture from Google Cloud that optimizes our integrated stack: performance-optimized hardware, including the advancements I shared that are powering TPU v5p; dynamic and efficient software, with leading ML frameworks like JAX, PyTorch, and TensorFlow out of the box; and the most versatile resource-management tools, based on decades of research. What's exciting is the system-level productivity it drives. Computing at the scale Gen AI demands requires efficiencies, but these have traditionally been pursued piecemeal, which can lead to bottlenecks. AI Hypercomputer applies system-level design to boost efficiency and productivity across AI training, tuning, and serving. Together, AI Hypercomputer with TPU v5p enables faster and more scalable model training, allowing new products and capabilities to reach customers sooner. But no two AI workloads are created equal, which is why we're investing in a broad portfolio of infrastructure for AI. At Next, we introduced the other member of the family, TPU v5e.
It delivers 2.3X better training and 2.7X better inference performance per dollar than v4. It is our most cost-efficient TPU to date. We also announced the GA of A3 VMs with NVIDIA GPUs, which achieve three times faster AI training. We have tremendous interest from customers for our AI infrastructure. More than 50% of generative AI start-ups build on Google Cloud, and more than 70% of generative AI unicorns are Google Cloud customers. Customers like Anthropic and Cohere are trusting us with their growing demands for AI infrastructure. And with that, back to you, Thomas.

>> THOMAS KURIAN: Thanks, Amin. Let's show you now how developers can access and build with Gemini's API, starting today. Today we're making Gemini Pro available in Google AI Studio, a new, free, web-based developer tool that helps you prototype and launch apps quickly with just an API key. When it's time for a fully managed AI platform, Vertex AI, our enterprise AI platform, allows you to customize and tune Gemini while addressing enterprise needs for security, safety, privacy, data governance, and compliance. Vertex AI helps developers discover models, tune models with their own data, augment them to include up-to-the-minute information, and take real-world actions. It allows them to manage and scale models in production, and to build search and conversational agents in a low-code/no-code environment. Today we'll talk about how we're improving the developer experience in each and every one of these areas. With the addition of Gemini, Vertex AI customers can now choose from over 130 models, made available as APIs, including Google-developed models like Gemini and a curated list of open-source and third-party AI models, all of which meet Google's strict enterprise safety and quality standards. Today we're pleased to announce six new and updated models, including Gemini Pro, of course. Let me just touch on a few of them. Imagen 2 is a major update to Google's state-of-the-art image generation model. You can create custom images with improved photorealistic image quality, text rendering to easily create images with text overlays, and the ability to generate logos. We are also excited to announce general availability of MedLM, a suite of medically tuned models. This builds on our innovations in Med-PaLM 2 to offer flexibility to healthcare organizations. We continue our commitment to an open model ecosystem, bringing leading open-source models -- Mistral, ImageBind, and DITO -- to Vertex AI's Model Garden. We help protect customers in many ways as well. Today we are extending our indemnification promise to now include model outputs from PaLM 2 and Imagen 2. This means if you're challenged on copyright grounds, we will generally assume responsibility for the potential legal risks involved. We will update our indemnification promise to include Gemini as soon as it's generally available.
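To make the Google AI Studio path mentioned above concrete, here is a minimal sketch of calling Gemini Pro with just an API key, assuming the google-generativeai Python package. The key and prompt are placeholders, not part of the announcement, and exact names may vary by SDK version.

```python
# A minimal sketch: calling Gemini Pro with a Google AI Studio API key,
# using the google-generativeai package. The key and prompt are placeholders.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder key from AI Studio

model = genai.GenerativeModel("gemini-pro")
response = model.generate_content(
    "Write a one-paragraph product description for a smart thermostat."
)
print(response.text)
```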
Now that you have access to the best models, you want to customize them with your data, for your brand, and for your needs. Vertex AI offers the most robust set of tuning capabilities of any platform, making it easy for developers of all skill levels to customize models. For instance, in banking, this lets you build agents that can access highly private data securely, and tune the model to offer the best client service in your own brand voice. We offer multiple ways to tune Gemini with your data to create your own model, optimizing performance and model quality. These techniques include prompt design, which lets you tune Gemini without ML expertise, using simple text prompts, and supervised adapter-based fine-tuning, which lets you use as few as 100 examples to improve the model's performance. You can also use adapter tuning with reinforcement learning from human feedback, and you can even do full fine-tuning with open-source models. We're also introducing a new capability today called distillation, which allows you to train your own smaller, task-specific models, generally with much less training data and with lower serving cost and latency than the original model. Distillation is a technique only available on Google Cloud. All approaches to tuning and distilling models ensure that your proprietary data is safe. We provide a comprehensive set of controls designed to keep your data secure and private. With Vertex AI, your input data, your output data, and any RLHF feedback are yours alone, and not accessible by anyone else when you use Gemini or any other foundation model in the Model Garden. Gemini is exceptional at understanding large volumes of information and producing outputs. But to be truly useful in the real world, Gemini needs access to up-to-the-minute, real-world information to contextualize its responses and to take action based on that information. Vertex AI provides you with augmentation tools to accomplish this. With grounding on Vertex AI, you can improve the accuracy and relevance of Gemini's answers using your enterprise's own data, with citations to understand the source of its responses. We offer out-of-the-box grounding against an enterprise's structured and unstructured data, and we are working with a few early customers to test grounding with the technology that powers Google Search. Another way to equip Gemini and other foundation models with up-to-date enterprise information is via Retrieval Augmented Generation, or RAG, a technique that involves fetching data from many different data sources, including your own company's, and enriching prompts with that data to deliver more relevant and accurate responses. Vertex AI offers a fully managed RAG platform with an automated end-to-end RAG workflow, including ingestion, retrieval, prompt augmentation, and citations. This can eliminate the need for you to write complex custom code to integrate data sources and manage queries. We also offer the option to build custom RAG solutions with your choice of vector database and high-quality multimodal embeddings. Once Gemini has the right data, you can enable it to take action in the real world with Vertex AI Extensions. Extensions can be used to build powerful AI agents that can, for example, ingest internal code bases and automatically look up evolving security threats in real time; balance labor assignments based on real-time utilization, open work, and deadlines; aggregate threat intelligence, generate proposals, and communicate with security teams for fast incident response; and even book calendar appointments, make payments, and take other actions on your behalf. And now, Vertex AI is introducing a new capability, which allows developers to provide functions as part of the API call to improve model responses. Gemini's output can now be fed to the external tool and API of your choice, giving you as a developer more control and flexibility.
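As a rough sketch of that function-providing capability, here is what the flow might look like with the Vertex AI Python SDK's preview generative models API. The function name, parameters, and project ID are hypothetical; treat this as an outline rather than the announced API surface.

```python
# A rough sketch of providing a function as part of the API call, assuming
# the Vertex AI Python SDK preview API. Function name and schema are made up.
import vertexai
from vertexai.preview.generative_models import (
    FunctionDeclaration,
    GenerativeModel,
    Tool,
)

vertexai.init(project="your-project", location="us-central1")  # assumed project

# Declare a function the model may choose to call; your own code executes it.
get_rates = FunctionDeclaration(
    name="get_rental_rates",  # hypothetical function
    description="Look up nightly rental rates for a property and date range.",
    parameters={
        "type": "object",
        "properties": {
            "property_id": {"type": "string"},
            "start_date": {"type": "string"},
            "end_date": {"type": "string"},
        },
    },
)

model = GenerativeModel("gemini-pro")
response = model.generate_content(
    "How much should I list this house for during the December holidays?",
    tools=[Tool(function_declarations=[get_rates])],
)

# Inspect the structured function call the model proposes; your application
# then invokes the real API with these arguments and feeds the result back.
print(response.candidates[0].content.parts[0].function_call)
```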
Once AI agents are built, Vertex AI's MLOps capabilities help you evaluate, manage, and deploy large models. Evaluating the quality of a model's response relative to other models is a significant challenge in deploying models. Vertex makes this much easier. We're now pleased to announce a new capability called Automatic Side-by-Side, which enables automated A/B testing, for instance with two different sizes of models or two different custom versions. In addition to Google AI Studio and Vertex, we've integrated our AI models, including Gemini, into popular platforms like Colab Enterprise, Firebase, and Flutter. With these integrations, any developer can now be an AI developer. Let's see now a demo of Gemini and Vertex AI together with Nenshad.

>> NENSHAD BARDOLIWALLA: Thanks very much, Thomas. Sorry you're feeling under the weather, but we're going to make this a fantastic experience for all of you. So I'm going to start out by showing you an example of how you can use Vertex AI with Gemini. I'm going to play the role of a developer working for Cymbal Home Rentals. We want to create a listing experience that's easy for hosts. How easy? We should be able to upload an image or video and get really nice descriptions very quickly. Let's see how we're going to do this with the Vertex AI platform. Vertex AI provides a unified set of capabilities for both predictive and generative AI workloads. I can prepare data, do model development, and deploy and use those models. But today, as a developer, I usually start my journey in what we call the Model Garden. And in the Model Garden, I have access to our first-party models, open-source models, and third-party models across a wide range of tasks and modalities. But the star of the show today is Gemini. So let's see it in action. You can see front and center, we have Gemini in the Model Garden, so I'm going to click on Gemini. And you're going to see that I get an overview, and a set of use cases that cover visual information seeking, captioning and description, and even advanced reasoning. You'll also see that I get information directly inside the platform that tells me exactly how to use this if I'm a coder. So I don't have to use the UI if I don't want to; I can go straight into my IDE of choice and use the Vertex AI SDK right on the fly. Now, in this case, I'm going to open Vertex AI Studio and start my use case. Let me copy and paste my prompt here, which says: generate a description of what you see in this video. Yes. Video. Because Gemini, as a multimodal model, is able to work with video. So I'm going to go ahead and insert a video right here, and this is going to be of the home rental that I want to create a listing for. Let's click on it and see it. It's a really, really beautiful place. Looks like a very nice kitchen, very nice living room, and it's a beautiful day. We should be able to get a great listing out of this. So now I'm asking Gemini to generate a description, list all the features of the interior of the house, and be factual, in a tone that would appeal to guests. Let's go ahead and submit that. In the background, as you heard from Amin earlier, TPUs, GPUs, and all the advanced infrastructure are working to bring back a Gemini result. Look what we get. It talks about a beautiful home, the open floor plan, the backyard, and I get this in a bulleted list.
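For developers who would rather make the same video-description request from code, as the demo mentions, a minimal sketch with the Vertex AI Python SDK might look like this. The bucket path, project ID, and model choice are assumptions for illustration.

```python
# A minimal sketch of the video-description request from code, assuming the
# Vertex AI Python SDK preview API. The bucket path and project are placeholders.
import vertexai
from vertexai.preview.generative_models import GenerativeModel, Part

vertexai.init(project="your-project", location="us-central1")

model = GenerativeModel("gemini-pro-vision")  # multimodal Gemini model
video = Part.from_uri("gs://your-bucket/home-rental.mp4", mime_type="video/mp4")

response = model.generate_content([
    video,
    "Generate a description of what you see in this video. List all the "
    "features of the interior of the house. Be factual, in a tone that "
    "would appeal to guests.",
])
print(response.text)
```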
This is great, but our style is a little more fun, a little more zany, kind of like me. And so I'm going to use the capabilities that Thomas just talked about, our tuning capabilities, to get the model to work more in my style. So I'm going to head over to Tune and Distill, the new tuning and distillation capabilities we just launched, and I'm going to create a tuned model. And I'm going to call this tuned model Cymbal Home Rentals Model. I'll use Gemini Pro, I'll pick an output directory, and I'll continue. All I need is a JSONL file with 100-500 examples to teach the model how to speak the language of Cymbal. I'll pick my Cloud Storage location, and I can even enable the new model capabilities in a single click. This can take about an hour, so instead of holding you on the live video broadcast for an hour, I'm going to use a tuned model I created yesterday using the exact same dataset. And now let's see what it does. I'll go back into the Studio and copy and paste that exact same prompt again. Let's also insert the same video that we saw before. But now, instead of Gemini Pro, I'm going to use my tuned Cymbal Home Rentals model. Let's see if the output looks like what I wanted it to. We'll give it a second as Gemini does all of its amazing processing. And there we go. This is the brand style that we love at Cymbal Home Rentals. We talk about "a testament to architectural grandeur" -- that's how our customers love to talk about their properties, and now Gemini speaks that way too. Of course, Gemini speaks many languages -- 38, as a matter of fact. So let's see what happens if I ask Gemini to speak in Spanish. I'm going to click submit, and we will see what Gemini does with Spanish translation. There we go. [ speaking Spanish ] Thank you to my high school Spanish teacher for training me to be able to say this on a live broadcast. So this is great. This is exactly what I wanted Gemini to do. Now, you should also know that everything I'm doing here in a point-and-click fashion is also available in code. It's very easy to use Python, Node.js, or Java, and it's a symmetric experience across any modality. So this is great. Now, I want to teach Gemini to be able to talk to the outside world. So let's go ahead and show our new extensions capability, which we just launched. I'll head to the builder. I am going to reach out to the Cymbal Publishing API, and I'm going to upload an OpenAPI spec file called Home Rental API Operations. You're going to see Vertex AI come back and show me all of the operations I'm able to do with this model and this extension. So I can export the description and publish listings, but I can also get real-time information from the outside world. So let me go ahead and create this extension. And now I can actually test it. In fact, I'm going to ask a question here: how much should I list this house for? Please provide weekly rates and keep in mind December holidays. And let's see the real-time capabilities in action. Look at that. The extension comes back; in the back end, we're using Vertex's vector search capabilities, as well as our multimodal embeddings, to compare all of the different house listings, the videos of those house listings, and the text, and we get back the results. December 1st through December 23rd, the rates look pretty reasonable, and no surprise, starting on the 24th, the rates go up. That means this is doing exactly what I hoped it would.
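To make the tuning step above more concrete: the demo mentions a JSONL file of 100-500 examples. A rough sketch of producing such a file is below; the field names are illustrative only, so check the Vertex AI tuning documentation for the exact schema your model version expects.

```python
# A rough sketch of preparing the JSONL tuning file mentioned in the demo.
# Field names below are illustrative, not a documented schema.
import json

examples = [
    {
        "input_text": "Describe this two-bedroom bungalow with a garden.",
        "output_text": "A testament to architectural grandeur, this "
                       "sun-drenched bungalow welcomes you home...",
    },
    # ... 100-500 examples total, teaching the model the Cymbal brand voice
]

with open("cymbal_tuning_data.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")
```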
Now, everything that you've seen so far, as Thomas talked about, is available today when you log in to Vertex AI. But now I want to show you the art of the possible for what's to come with Vertex in the next few weeks. We saw an example -- I'll go back to our AI Studio -- of what happens when we go to video. I'll copy and paste this prompt to recreate it, and I'll insert my video. But you heard earlier from Eli that Gemini is a model capable of working with images, text, and video, and doing advanced reasoning. So let's see how that's going to work inside of Vertex. We saw video before, but that was only for the interior of the house. Let's now add the front and the rear of the house in pictures. Pictures, text, and video, all together, simultaneously, with Gemini. Let's take a look at the pictures: there's the front of the house, there's the back of the house, and here is the same video that we saw before. And now I'm going to tell Gemini to generate a description of what it sees in this video and these images, and in the description, list all the features in the interior and exterior of the house. And I'm hoping that Gemini will show me that it's able to reason that these are all pictures of the same house, inside and outside, and come up with an amazing description for my customers. Let's go ahead and submit that. We let Gemini do its magic. And look at that. Picture this: a stunning abode, adorned with modern architecture, and expansive windows that flood the interior with natural light. So Gemini figured out we were talking about the outside of the house with the images. But then Gemini starts to talk about being greeted by an open floor plan, and the living room, and the fireplace. That's the inside of the house. Gemini was able to connect the inside and the outside of the house together, across text, images, and video. This is the power of multimodal inside of Vertex. Vertex AI's Gemini multimodal support for images and video, our new tuning capabilities, our multilingual support, and our extensions give me all the pieces I need as a developer to make home listings easy for our customers. Back to you, Thomas.

>> THOMAS KURIAN: That was awesome. Can't wait to get it in the hands of all developers. Now let's talk about how Vertex AI makes it easy to create search and conversation agents quickly, without writing code. Whether you are looking to generate, retrieve, or summarize information, you are going to need to search your enterprise's data for most generative AI-powered applications and agents. Vertex AI Search provides you with a Google-quality search experience: an out-of-the-box information retrieval and answer generation system. Vertex AI Search benefits from Google's deep experience in search, including implementing a hybrid of semantic and keyword search, understanding user behavior and intent, document annotation with knowledge graphs, and reranking techniques using state-of-the-art model distillation technology. Building a Google-quality search engine for Retrieval Augmented Generation is hard. Vertex AI Search makes the entire process of ETL, OCR, chunking, embedding, indexing, reranking, storing, retrieval, and summarization super simple. Today we're pleased to announce three important new advancements in Vertex AI Search.
First, we will very soon be integrating the power of Gemini into Vertex AI Search as an option for summarization and answer generation, enhancing the quality, accuracy, and grounding capabilities of your search results. Second, we are pleased to announce blended search support, now in preview in Vertex AI Search. What is blended search? It allows you to run a single query across structured data, like your product catalog; unstructured data, like customer chat logs and forums; and public web pages, like your product descriptions. This means you don't have to build separate search engines for each type and modality of data and then integrate them. Blended search means users get better-quality search results, faster. Third, we're introducing Vector Search. Vector Search is for customers who have more complex use cases and want to build custom search applications using multimodal embeddings, for ad serving or e-commerce recommendations. We offer tuning capabilities for Vector Search, which you can combine with platform offerings like multimodal embeddings and our databases. Vector Search can now also be used with Gemini to power your advanced search applications. Vertex AI Conversation facilitates the creation of natural-sounding, human-like voice and chat-based conversational agents. Starting today, you can choose Gemini as an optional foundation model for your conversational agents, using Gemini's advanced reasoning to handle more complex questions. We're also introducing a few important new capabilities. First, playbooks, which are now in preview. Playbooks allow you to teach the conversational agent in the same way you would a person, giving it instructions in natural language. Playbooks leverage Vertex AI's broad suite of data connectors and extensions, allowing conversational agents to serve fresh information to users and complete actions such as payments, bookings, and scheduling meetings on users' behalf. You can enter the agent's persona, goals, and the steps you want the agent to follow in simple natural language. You can also provide examples of how it's done, and you can preview your agent's responses by testing it before deploying it. Second, Vertex AI Conversation now has an inline simulator to improve the quality of your agents' responses, with full debuggability and a streamlined feedback-loop process. Third, conversational agents can be built once and deployed across multiple channels, including web, mobile app, call center, and in-person points of sale, for example in retail stores or bank branches. We're seeing strong momentum with Vertex AI Search and Conversation, for instance helping customers and employees with research assistance, food ordering, travel planning, and so much more. Let's hear from Forbes, who recently launched their Adelaide search experience.

>> I'm David Johnson, chief data officer for Forbes Media. We've built Adelaide, our new interactive search experience. So you could have a user that comes to Forbes looking for travel insight. You could type: what are the top 10 vacation destinations, what are great hotels? And it maintains the context. Vertex AI Search and Conversation was a light lift for our data science and ML ops teams. We had a proof of concept up in a matter of two weeks, unprecedented for us. With Google Cloud, Forbes is building a new way for our users to access the data they want.
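For developers curious about the custom Vector Search path described above, here is a rough sketch of generating multimodal embeddings and comparing them, the kind of step you might run before loading vectors into a vector database or Vertex AI Vector Search. The model name, file path, and helper function are assumptions for illustration.

```python
# A rough sketch: multimodal embeddings plus a simple similarity check,
# assuming the Vertex AI SDK's multimodal embedding model. Paths are placeholders.
import numpy as np
import vertexai
from vertexai.vision_models import Image, MultiModalEmbeddingModel

vertexai.init(project="your-project", location="us-central1")

model = MultiModalEmbeddingModel.from_pretrained("multimodalembedding")
emb = model.get_embeddings(
    image=Image.load_from_file("listing-photo.jpg"),  # placeholder file
    contextual_text="Modern three-bedroom home with an open floor plan",
)

def cosine_similarity(a, b):
    a, b = np.asarray(a), np.asarray(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Image and text embeddings share one vector space, so they can be compared
# directly -- the property that makes blended, multimodal retrieval possible.
print(cosine_similarity(emb.image_embedding, emb.text_embedding))
```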
>> THOMAS KURIAN: Now that we've covered Vertex AI and Gemini together, let's look at how we at Google are building our own agent, Duet AI, to assist people at work. Please welcome Gabe Monroy.

>> GABE MONROY: Thanks, Thomas. Earlier this year we unveiled Duet AI, an always-on AI collaborator. Duet AI works in Google Workspace, and has been GA since August to help write and refine in Google Docs and Gmail, generate custom images in Google Slides, and more. Duet AI also works in Google Cloud, helping with development, operations, data insights, cybersecurity, and more. Today we are announcing that Duet AI for Developers and Duet AI in Security Operations are both generally available. Let's start with what we are doing for development teams. AI assistance brings huge productivity gains when it comes to software development. AI assistance can also improve the quality of the software that is being written, for instance by improving documentation of code, ensuring that code is better tested, and ensuring that software dependencies are checked more rigorously. With the GA of Duet AI for Developers, we're here to make every dev team more effective. Duet AI for Developers is the only assistant fine-tuned to provide the best possible answers for developers using Google Cloud, all within their favorite tools and IDEs. Duet AI for Developers is also the only AI assistant on the market that goes beyond AI-powered code and chat assistance to running applications on Google Cloud, with unique capabilities around application deployment, trouble-shooting, and issue remediation. Priyanka, let's show how easy it is for developers to build, deploy, and operate a real-world application in less than five minutes with Duet AI for Developers. Priyanka, over to you.

>> PRIYANKA VERGADIA: There are a lot of capabilities developers expect from a coding assistant, like code explanation and code generation. As you'd expect, Duet AI is fantastic at these. Let me quickly show you. Here, let's ask Duet AI to explain this code to me. And there we have it: an entire function explanation for our code. Now I also want to add unit tests. I can do that with just one click. See? An entire testing scenario is set up for us. I also want to use Duet AI to improve the readability of this file by adding documentation and comments, directly from my development environment. I'm going to say, add comments to this code, and I want to do it for the entire file. So here we go. And Duet AI has regenerated this entire file for me, now with comments, making it a lot more readable. Now, let's say I want to generate some code. Here I am in another file, and I've already typed up a comment in natural language, which asks for a new route for our app -- a new function that takes sales data and our product information. So here I'm asking Duet AI to generate this function, and there: I have an entire function generated in high-quality code. This is cool and makes Duet AI a joy to use. But these are things you would expect in a coding assistant. What makes Duet AI unique is that it is the only coding assistant that is native to Google Cloud, fine-tuned on Google Cloud best practices, and designed to help you across the entire software development life cycle, including running and operating applications. Now, let me show you what I mean. So for this code, I have the data in a JSON file.
I'd like to store this data in Firestore, which is Google Cloud's document database. I'm not an expert on Firestore, but fortunately, Duet AI is. I'm going to ask Duet AI chat to generate code to create a function that uploads this data into a Firestore database. And Duet AI, because it is trained on Google Cloud best practices and products, has generated this entire function for me; all I need to do is insert it into my current file. And I'm done. Now I'm ready to deploy this web app to Google Cloud. So let's ask Duet AI to help deploy it in a container on Google Cloud, which is the home of Kubernetes. I am going to ask: what are the services to deploy containers in Google Cloud? And this should give me the options I have to deploy in Google Cloud. I have GKE and Cloud Run; I'm going to select Cloud Run, because we want to go serverless. And now I want to know how to deploy my application to Cloud Run, and because we are in Cloud Code, I'm going to ask it a little more specifically to generate steps to deploy from Cloud Code. There, we have all our steps. I'm going to follow the steps now and click on deploy, and after following these steps, I have triggered a deployment. This may take a few seconds, but Duet AI is the only AI assistant that enables a smooth journey from development to deployment in Google Cloud. Now, while it is taking a few seconds, I already have a deployed version that I've set up for us here. Let me click on that one to show you the completed deployment. We have deployed our app, where you see how much we've sold of these products this month. And with that, we've built and deployed the app. Now, running and operating applications is equally as important as coding and developing them. To show you what I mean, let's check the logs from my running application to see if things are okay. I can do that in my logs right here: I'm going to click on one of my deployments to open up the logs, and I can see all of them. But let's say I want to drill deeper and perform some advanced tasks. For that, I'm going to click and get into Logs Explorer in the Google Cloud Console. This is one of the logs that I'm interested in; it's catching my eye because it says "insecure request warning." That is not good. So let's drill a little bit deeper into that. I see this can be explained, so I'm going to click on that, and here as well, I love how Duet AI is still by my side in the Console where I'm trouble-shooting a live, running application. Now, here it looks like we have an insecure HTTPS request, so we might want to secure that. I'm going to ask some follow-up questions to Duet AI to help me fix this code to make it more secure. So let's say I ask: how do I secure my insecure HTTPS connection in a Python file? Now, I'm making all these spelling mistakes, but I hope Duet AI catches them. And there, it has done that. And it's telling me a little bit about what I can do and which files I can change -- it's an easy fix, which will help improve our security quite a bit. Voila. With Duet AI, we have turned the hours-long task of building a real-world app into just five minutes. My team is going to be so impressed with me. Back to you, Gabe!
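For reference, a rough sketch of the kind of Firestore upload function the demo describes is below, using the google-cloud-firestore client library. The file name, collection name, and document key are hypothetical; the generated function in the demo may differ.

```python
# A rough sketch of the kind of function described in the demo: load a JSON
# file of products and upload each record to Firestore. Names are hypothetical.
import json

from google.cloud import firestore


def upload_products_to_firestore(json_path: str) -> None:
    db = firestore.Client()
    with open(json_path) as f:
        products = json.load(f)
    for product in products:
        # Use each product's id as the Firestore document id.
        db.collection("products").document(str(product["id"])).set(product)


upload_products_to_firestore("products.json")  # placeholder file name
```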
>> GABE MONROY: Thanks, Priyanka. So amazing to see a real-world app built so quickly. What I found most impressive is that Duet AI is the only AI assistant on the market that handles the entire software development life cycle, from development to operating a running application, with quality at every step. With Duet AI, we're helping leading brands like PayPal boost developer productivity, and we're enabling retailers like RE Appliances to gain new insights. Soon, customers will be able to fine-tune the underlying models for Duet AI with their own codebase, so developers can receive code suggestions informed by their internal private code. We're testing this with customers and expect it to be generally available in 2024. Today we're also announcing more than 29 code-assist and knowledge-base partners who will contribute datasets specific to their platforms, so users of Duet AI can receive AI assistance based on partners' coding and data models, their product documentation, and best practices. Code-assist partners help Duet AI for Developers provide technology-aware coding assistance; with partners like Confluent, HashiCorp, and MongoDB, Duet AI for Developers will be trained to generate code that is specific to those platforms, so that developers can build and trouble-shoot faster. Knowledge-base partners allow Duet AI for Developers to provide access to documentation and knowledge sources, with partners such as Datadog, JetBrains, and LangChain. Lastly, Google Cloud service partners play a critical role in helping customers adopt Gen AI, including Duet AI. These partners have committed to train over 150,000 experts, and can help enterprises bring Duet AI into their existing development workflows. Next, I'd like to talk about security. In Google Cloud, we help protect customers' applications and businesses end-to-end. First, we produce world-class threat intelligence via Mandiant, which provides information on the latest attack vectors, as well as assessments of who may be targeting our customers. Second, we use that threat intelligence to quickly detect, investigate, and remediate security incidents through Chronicle, across all of a customer's environments, including other clouds. Third, for Google Cloud specifically, we go deeper via Security Command Center, which understands most parts of a customer's applications running in Google Cloud to identify misconfigurations, vulnerabilities, and threats, and helps mitigate and remediate those risks. Across each of these, we are infusing Duet AI into key security workflows, which helps security customers detect threats faster and reduces the toil for security operations teams protecting their organizations. Today, with Duet AI in Security Operations, we're the first cloud provider to make Gen AI generally available in a unified SecOps platform. This incorporates our leading security intelligence to help security professionals protect their organizations and respond to the barrage of alerts and security anomalies they see each day. With Duet AI in Security Operations, security professionals can search event data and query across many different log types without needing to know specialized syntax, saving valuable time. Early user feedback showed that Duet AI reduces the time analysts spend writing, running, and refining searches and triaging complex cases by approximately seven times. Priyanka, let's show how you can find security risks and address them quickly with Duet AI in Chronicle Security Operations.

>> PRIYANKA VERGADIA: Thanks, Gabe. Duet AI in Security Operations helps improve security operations for teams. Here I am in Chronicle.
I'm looking to get some insights into the crypto mining attacks in our environment in the last week. So I type my query in natural language, and Duet AI converts my request into a query that Chronicle can understand. From here I can click on search, and Chronicle goes deep into all this data and gets me a list of all the alerts around crypto mining. And once it gives me that list -- there we go, here's our list -- I can drill deeper into these alerts and find out a little more about what is happening in our environment. So here I am in a detailed alert view. And Duet AI is following along as an AI assistant here as well, instantly summarizing case data and alerts for me, including what has happened and why it matters, along with recommendations on how to respond next. In the world of security, every second counts. These are critical details any security practitioner needs, whether they are new to the team or a seasoned SecOps analyst. Back to you, Gabe!

>> GABE MONROY: Thanks, Priyanka. Duet AI in Security Operations launched in preview earlier this year with over 150 customers. Feedback has been overwhelmingly positive, with less than 1% of requests receiving a negative rating by users. Customers like Pfizer are excited about Duet AI's potential and the ability to use simple language queries to search petabytes of data and perform complex analysis, onboard analysts faster, and rapidly pursue advanced threats. To get started with Duet AI, please explore the Security Operations Enterprise and Enterprise Plus packages. You can get started at no cost on Duet AI for Developers until February 1st, 2024, by visiting our website. And stay tuned for more to come in 2024. Before I hand it back to Thomas, let's hear from Turing, who's been using Duet AI to improve developer productivity.

>> I'm the CEO and cofounder of Turing. Turing is the world's first AI-powered tech services company. With over 3 million software engineers on our platform, we use AI to automatically vet them, match them, and speed up the productivity of the software engineers that we deploy to companies. Tools like Duet AI help us give superpowers to every software engineer; they help us ship higher-quality work faster. We saw productivity improve by over 30%. Think of what a planet of 7.8 billion people could do with 10x more productivity. It's going to be a wonderful world.

>> THOMAS KURIAN: Thank you, Gabe, and Priyanka. Along with our AI technology stack, we have been on a mission to build the most innovative and open ecosystem for customers to build AI agents. This includes foundation model partners, like AI21 Labs, Meta, and Mistral, and tools and applications partners, like SAP and Workday. Our network of professional services partners helps customers successfully build and bring generative AI to their organizations. One of our largest professional services partners globally is Accenture. I'm pleased to announce an expanded partnership to deliver generative AI to enterprises, to improve operations, create new lines of business, and build amazing and unique customer experiences. We will provide engineering and rapid prototyping services for generative AI, including Gemini. We're also pleased to be working with a large number of customers who are using and building with our generative AI models. Customers like AI21, Bending Spoons, Lightricks, AppLovin, and AssemblyAI are trusting us with their growing demands for infrastructure.
Customers including McDonald's, Carrefour, Shutterstock, Bayer, Spotify, Dun & Bradstreet, Formula E, The World Bank, Yahoo!, and JetBrains are using Vertex and our models to build their own agents. And L'Oreal, Liberty Global, Purple Soil, and Axmos, for instance, are using Duet AI to increase productivity and resolve security threats faster. We are very pleased to announce a new partnership today with HashiCorp. Our teams will work together to launch new AI-powered features across HashiCorp's product suite, all built on Vertex AI. Duet AI will also make it easier for Google Cloud users to use Terraform. Today's announcements wrap up a super busy year for our engineering teams. Here are some statistics for you. Since January, we have introduced a number of important new capabilities in our AI Hypercomputer infrastructure, with new advances in GPUs, TPUs, ML software, compilers, frameworks, workload management, and many others. We have introduced three major new generations of models, culminating, of course, with the introduction of Gemini today, along with over 500 features in Vertex AI and a new suite in Duet AI, in both Workspace and Google Cloud. We've seen amazing, unbelievable developer growth. For example, just last quarter, the number of active generative AI projects in Vertex AI grew by seven times. We're wrapping up the year with a slate of new advances today. We introduced Gemini, a model that delivers a variety of state-of-the-art features, including native multimodality and advanced reasoning. We introduced a new AI Hypercomputer that leads the world in performance, latency, and cost in both AI training and serving. We brought Gemini to Vertex AI, starting with Gemini Pro, along with significant new features to build agents that combine native multimodality, advanced reasoning, up-to-the-minute knowledge, and the ability to take actions, all fine-tuned to your data. With Duet AI for Developers, we help you develop, document, test, refactor, and modernize code while working within the development tools and platforms that each of you developers is familiar with. We also released Duet AI in Security Operations to help protect you and your company from cyberthreats. We have updated pricing that makes Gemini accessible to every developer, and we are introducing expanded indemnification to help protect you from copyright concerns. We continue to take important steps to enable developers to build agents that think, understand, and act on information the way that we as humans do. At the same time, we remain firmly committed to our promise to be bold and responsible in bringing you these advances. Thanks for joining us today. Next, you can get a hands-on deep dive with our developer advocates Nikita and Dale, and then dive into breakouts for app building, data science, and great partner solutions. Thanks! And enjoy the rest of our summit.

>> NIKITA NAMJOSHI: I'm Nikita.

>> DALE MARKOWITZ: And I'm Dale.

>> NIKITA NAMJOSHI: Today you heard about how generative AI is a powerful tool for developers. Now we're going to take a look at how you can get started building, hands-on, with Google's multimodal model, Gemini. Let's dive in.

>> DALE MARKOWITZ: Generative AI allows us to solve all sorts of problems that were difficult for computers to tackle before. One is providing answers for complex, nuanced questions phrased in natural language. Think of it as the model's ability to explain something.
For example, if you've ever asked an LLM a question like, why isn't this code working? Or, how can I explain a combustion engine to a 5-year-old? You've experienced this capability firsthand. What's really neat is Gemini's multimodal support, which allows us to ask questions that are too hard to put into words -- where we need a picture or video to explain. Like, I can send Gemini this picture with the text: what is this object and how is it used? Gemini will recognize that this is a sundial, a device that uses the sun's position to tell time. Another great way to use this feature is helping users trouble-shoot -- not just software, but also life. Let's say I want to build an app that helps people take better care of their plants and trouble-shoot problems. I want to upload a picture of a plant that's not looking too great and know what I should be doing to keep it healthy. Let's prototype this functionality with Gemini. I'll jump into Vertex AI Studio, open a new prompt, and upload an image -- the image of my plant. Then I'll say: identify this plant and what's wrong with it. I'll hit submit -- I'm going to use Gemini Pro -- it will take a second while it's processing its input, and it tells me that this is a philodendron, which is the correct type of plant, that it's suffering from root rot, and it gives me tips on what I can do. Within a few minutes, without custom model training or anything, I was able to build a tool that answers a complex question and responds in natural language. Let's use Gemini to answer a different kind of user question -- one that doesn't necessarily have a right answer: what piece of furniture will look best in my house? You can imagine, especially for e-commerce applications, that we sometimes want to ask questions that compare multiple different images. For example, I want to send Gemini a picture of my living room and all the different cabinets I'm considering buying, and ask for design advice. Here's a prompt that does that. I'll send Gemini three images of cabinets -- all the options -- and a picture of my living room, and ask which one looks best. We'll send in the same kind of request, using the Gemini Pro model, and let's see what it thinks would look best here. Gemini thinks option three would look best, because the other two cabinets are too ornate, but the modern style of the third fits in great. There you have it. No matter what you build, you know your users will have questions, whether they're trying to figure out which product to buy, understand a chart, or trouble-shoot a broken appliance, and LLMs are a great way to answer those questions.
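If you want to try the plant trouble-shooting prompt from code rather than the Studio UI, a minimal sketch with the Vertex AI Python SDK might look like the following. The image path and project ID are placeholders, and the multimodal model choice is an assumption.

```python
# A minimal sketch of the plant trouble-shooting prompt sent through the
# Vertex AI Python SDK preview API. Image path and project are placeholders.
import vertexai
from vertexai.preview.generative_models import GenerativeModel, Image

vertexai.init(project="your-project", location="us-central1")

model = GenerativeModel("gemini-pro-vision")  # assumed multimodal model
response = model.generate_content([
    Image.load_from_file("my-plant.jpg"),  # placeholder image path
    "Identify this plant and what's wrong with it.",
])
print(response.text)
```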
>> NIKITA NAMJOSHI: Let's look at another use case we're going to call extract. By that we mean using Large Language Models to extract useful information from data that's otherwise difficult to work with in code -- like PDFs, videos, images, transcripts, and any kind of unstructured text. So let's say you want to build an app that helps you analyze the check and split the bill. Or take data from different sources that have slightly different schemas and standardize them. Or maybe you want to take messy text and extract structured data from it. Let's say we want to build an app that takes customer food orders at a drive-through window. We could use something like the Speech-to-Text API to transcribe what the customer is ordering, but the problem here is that most people tend to be wordy when they order. They might use a lot of ums, or change their mind, or give complicated instructions, like: I want another latte, but this time with oat milk. We can use a Large Language Model like Gemini to make sense of a messy transcript like that. In Vertex AI Studio, I'm going to open a new text prompt and instruct the model to extract the items from this transcript in JSON format, and I'll also specify to separate drinks and food. Next I will paste in a transcript of the conversation, using the Gemini Pro model, and send this to the model. You can see the model extracts the key information and structures it in JSON, which is easy for my application code to use. The model also identifies some nuances in the transcript, like where the customer changed the sizing for one of the beverages but kept it the same for another. JSON is a useful format for application development, but we can change the output format. Let's specify instead that we want this transcript in table format. We'll send all this to the model again, and this time, just by changing the prompt a little bit, we should be able to get a different output. The model takes this information and puts it into Markdown, and there we go: we have our table with our items, the different sizes, and options.

>> DALE MARKOWITZ: Nikita showed you how Gemini can help extract useful data from unstructured text, but video files are even more cumbersome: they're large, and they consist of unstructured audio and imagery. Luckily, Gemini supports video inputs too. Let's take a look. Here I have a commercial for the Pixel 8 Pro. [ playing captioned video ]

>> DALE MARKOWITZ: In Vertex AI Studio, I can upload that video directly. I'm going to send this text-and-video prompt to Gemini, using the Gemini Pro Vision model, and I'm asking it a couple of questions. I'm also asking it to write a description of the video, and requesting that it give me everything in JSON format. It's going to take a little longer than the image-and-text responses, because the video is a bigger file with multiple image frames. We'll see what it says. All right: it identifies what language the commercial was in, and generates a nice description, in JSON format. There's so much unstructured data in the world, and LLMs are a great tool to make that data easier to use and understand.

>> NIKITA NAMJOSHI: Another interesting application is when we want to transform the modality or style of something entirely. If you have ever used a tool like Google Translate, you've used a Large Language Model to do just that. And with Gemini, you can quickly prototype all types of creative text-to-text transformations, like transforming code from Python to JavaScript, or even from Shakespearean English to modern English. Another common text-to-text transformation is summarization. So, for example, we can use a Large Language Model to summarize the transcript of a long meeting, or we can summarize a list of user product reviews. In fact, let's see how we might do that in code. Here in BigQuery, I have a bunch of product reviews. Normally, when I'm shopping, I want to get a sense of what others are saying before I buy something, but I don't want to have to read through all of the many reviews. So instead, I'll use a Large Language Model to create a summary review. I'm going to jump into a Python notebook, and the first thing I'll do is import Gemini using the Vertex AI Python SDK. Then I'm going to load my data -- extract these reviews from BigQuery into a data frame -- and when I have that information, I'll go ahead and define my prompt. I'm going to say: write a summary review from this list of user reviews. Then I will paste in all of my reviews right here, and note that I am parameterizing the reviews so I can load them in from BigQuery. Once the prompt is defined, I can send it to the model and print out the response. The model generates a summary review for my product, and now that I have this prompt ready, I can generate summaries for all of the other products in my inventory and surface them in my application.
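A rough sketch of that notebook flow is below: pull reviews for a product from BigQuery into a data frame, then ask Gemini for a summary review. The table, column names, and product ID are hypothetical stand-ins, not the dataset used in the demo.

```python
# A rough sketch of the summarization notebook: reviews from BigQuery -> data
# frame -> one Gemini summary. Table and column names are hypothetical.
import vertexai
from google.cloud import bigquery
from vertexai.preview.generative_models import GenerativeModel

vertexai.init(project="your-project", location="us-central1")

bq = bigquery.Client()
df = bq.query(
    "SELECT review_text FROM `your-project.reviews.product_reviews` "
    "WHERE product_id = 'SKU-123'"  # placeholder table and product id
).to_dataframe()

reviews = "\n".join(df["review_text"])
prompt = f"Write a summary review from this list of user reviews:\n{reviews}"

model = GenerativeModel("gemini-pro")
print(model.generate_content(prompt).text)
```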
>> DALE MARKOWITZ: To recap: so far we've talked about explaining, extracting, and transforming data with generative AI. One of the most common use cases is chatbots. Maybe you've chatted with a general-purpose chatbot. As developers, we can make chatbots more useful for custom applications by connecting them to external data sources and tools. Let's say we want to build a chatbot to help book a trip. Maybe we need to connect it to an API so that our bot can respond with real-time information, like flight delays. Or maybe we want to build a chatbot that helps us keep up to date on research. Let's see how we would prototype something like that: a bot that keeps us up to date on the latest ML research by connecting to arXiv, where tons of AI research is published. Building a chatbot that calls out to those tools is a complicated use case, because we need to not only keep track of the conversation history between the bot and the user, but also figure out a way of deciding when our bot should call our tools. This pattern is called an agent, so let's see what it looks like to build something like an agent in code. Agents can be complicated to implement, but there are tools to make it easy, like the popular open-source LLM toolkit LangChain. Here's what it looks like in code to create an agent using Gemini and LangChain. I'm using LangChain to create an agent executor that wraps around Gemini, and this is responsible for doing all of the hard work of connecting Gemini to the arXiv API, deciding when to call that API, keeping track of conversations, and a whole lot more. Using that code, we can build a chatbot that works like this. Here's my talk-to-arXiv bot, and I'm going to say: what's the most state-of-the-art way to do RAG? If you're an LLM developer, you've probably heard of RAG -- it's very popular. So the bot does some stuff behind the scenes -- we'll look at this in a second -- and it responds with an answer: the most up-to-date approach to RAG is to tune the architecture. What was happening when it produced that answer? If I click here, we can see that first Gemini actually thought: do I need to use a tool to answer this question? When it decided yes, it called out to arXiv, got all of these papers back as responses, and its output takes into account all of the things it got from the tool. If this is too dense for me, I can ask it to explain like I'm 10. Maybe that's easier for me to parse. That's how you can build an agent using Gemini and LangChain. You can add more than just one tool and make your agent more powerful.

>> NIKITA NAMJOSHI: There you have it -- some of the most useful ways to use generative AI in your applications. We can't wait to hear what unique use cases you'll discover.

>> DALE MARKOWITZ: If this was exciting and you want to learn more, we've published samples and notebooks and invested in learning resources for developers looking to get a head start on everything Gemini and generative AI.

>> NIKITA NAMJOSHI: Thanks for watching, and we're so excited to see what you build next.
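For readers who want to try the agent pattern Dale described, here is a rough sketch wiring Gemini into a LangChain agent with an arXiv tool. The package and class names reflect the langchain and langchain-google-vertexai libraries as commonly used, not the exact code from the demo, and may differ across versions; treat this as an outline, not a drop-in recipe.

```python
# A rough sketch of the agent pattern from the demo: Gemini wrapped in a
# LangChain agent executor with an arXiv tool. Names may vary by version.
from langchain.agents import AgentType, initialize_agent, load_tools
from langchain_google_vertexai import ChatVertexAI

llm = ChatVertexAI(model_name="gemini-pro")

# The arxiv tool lets the agent search and summarize papers from arXiv
# (it requires the `arxiv` pip package to be installed).
tools = load_tools(["arxiv"])

# The agent executor decides when to call the tool, tracks the intermediate
# reasoning steps, and folds tool results back into the final answer.
agent = initialize_agent(
    tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)
agent.run("What's the most state-of-the-art way to do RAG?")
```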