Google Releases AI AGENT BUILDER! 🤖 Worth The Wait?

Captions
All right, so Google finally launched an agent platform, and we're going to take a look at the announcement right now. This is from the Google Cloud Next 2024 keynote speech. I did a supercut of it, but now I want to talk more specifically about it, and we're going to watch it together. In a video that I have planned, I'm going to show you how to use the Vertex AI Agent Builder yourself. I've been playing around with it and it's pretty cool. All right, so let's start with the keynote. Now let's dive into Vertex AI, our fast-growing enterprise AI platform. In our Vertex AI Model Garden, you can access over 130 models, including the latest versions of Gemini, partner models like Claude from Anthropic, and popular open models including Llama, Gemma, and Mistral. All right, so first, uh, the Model Garden, which seems pretty cool. They have a bunch of different models that you can use, both open source and closed source, and in fact, let me just show it to you. All right, so here it is. Here's the Model Garden, and we can see it has Gemini, Imagen, Gemma, Chirp. Here's Gemini 1.5 Pro. So it is really cool that they have all these models in the same place. Here's Stable Diffusion LoRA. They have it filtered by modality, so language, vision, tabular, document. They also have it filtered by task, so generation, classification, etc. So if we click into Gemini 1.5 Pro, we can see all the information about it: use cases, documentation. This feels like Hugging Face, and it's interesting because Hugging Face actually showed up at the Google Cloud Next keynote, but I guess they don't see this as competitive. And you can open in Vertex AI Studio, and so that's where you can start playing around with it. And here are all the models that they have, as they said. So here's Llama 2, Claude 3, Stable Diffusion, Mixtral 8x7B, WizardCoder. I mean, they really have a ton of the top models, so really, really cool. All right, let's keep watching. You choose the best model for your use case, budget, and performance needs, and switch between models as you need
to. Today, we're taking Gemini 1.5 Pro into public preview. All right, so this is pretty cool: Gemini 1.5 Pro in public preview. I've had access to it for a while. I've been playing around with it. Having a million token context window is absolutely insane. Being able to drop an hour-long video into a prompt and have it answer questions about that video is kind of mind-blowing. There's an example in there where you can load up a movie and ask it a question about what was on some, like, note that somebody took out of their pocket in a scene that maybe lasted just a couple dozen frames. Really, really impressive stuff. All right, let's keep watching. Gemini offers the world's largest context window, with support for up to 1 million tokens. With Gemini 1.5 Pro, customers can now process vast amounts of information in a single stream. All right, I want to pause for a second. They're talking about 1 million tokens, but it has already leaked that they have 10 million token context windows internally that they're working on. These massive context windows are going to open up brand new use cases, and I'm super excited to see how well they work. Including 1 hour of video, 11 hours of audio, codebases with well over 30,000 lines of code. I mean, that is a monster use case. Being able to have 30,000 lines of code in a single context window is really, really impressive. Now, of course, most mature codebases are well over 30,000 lines of code, so there's still going to be a need for mapping out codebases using RAG solutions like Pinecone. So we're still very far away from being able to put an entire codebase in a single prompt. Over 700,000 words. We're enhancing Gemini 1.5 Pro with the ability to process audio, enabling cross-modality analysis. For instance, you can use it to search in audio and video content, for example, find a timestamp in a baseball game video where a commentator says it's out of here. We've seen some amazing examples of what people can do with this large context window. Sundar mentioned a few, and
others include: a university professor is using it to extract data from a 3,000-page document with text, data tables, and charts in just a single shot. Yeah, that's probably one of the coolest use cases: just being able to load up huge PDFs, huge documents, and being able to summarize them easily, extract information from them accurately. I'm really excited about a million token context window. And he also mentioned audio, which is really cool. I can load up an hour-and-a-half-long podcast and ask questions about it, and it'll give me answers based on the context of that podcast. So very, very cool. Okay, so I'm just going to skip ahead a little bit. Let's keep watching. We're also announcing the availability of CodeGemma, a fine-tuned, lightweight open model designed for coding, from the same technology used to create Gemini. All right, so I've used Gemma and frankly it was very unimpressive, but I know they just released a new version of Gemma, so I definitely have it on my list to test out. And look, I am appreciative of any company that is releasing open source models, so thank you to Google for releasing Gemma. And now maybe I need to test CodeGemma, because now they have a fine-tuned version of Gemma specific for code. Let's keep watching. With these additions, Google Cloud continues to be the only cloud provider to offer widely used first-party, third-party, and open-source models. Vertex AI can be used to tune, augment, manage, and monitor these models. All right, so yeah, I mean, Google's really getting in the game now. I'm impressed with all of these announcements. Their model builder allows you to fine-tune, allows you to do a whole bunch of stuff with the models. But what I really want to know about, and what I really want to talk about today, is their agent framework. So I'm going to skip ahead and we're going to take a look at that. Vertex AI is the only AI platform to provide a single platform for model tooling and infrastructure. Now let's look at the types of agents customers are building on Google
Cloud using generative AI. All right, so now they're going to be talking about customer agents. When I hear agent, I think about AutoGen, I think about CrewAI, I think about agents that are coded, given tools, given personalities, given backgrounds, that can work together to accomplish and automate tasks. I think when Google is talking about agents, they're mostly talking about customer service agents. This feels very similar to OpenAI's Assistants or their custom GPTs product. It doesn't feel like a fully featured agent framework to me, at least not yet, but let's take a look and see what they say, and I'm also going to show you a little bit of the interface itself. First, customer agents. You know, similar to great sales and service people, customer agents are able to listen carefully, understand your needs, and recommend the right products and services. They work seamlessly across all your channels: the web, your mobile app, your point of sale, and your call center. And they can be integrated into product experiences with voice and video. Mercedes-Benz is working with us on customer agents to help people in their amazing cars. Let's hear from their CEO, Ola Källenius. At Mercedes-Benz, we want to offer our customers an exceptional digital experience. That's why we're equipping our cars with high-end computers. Each car should only get better over time, just like a good wine. And with the power of Google Cloud and AI, we will make the user experience even more personalized. Our partnership across Google helps us build more intuitive and customized experiences. Last year, we announced our partnership with Google Maps, and today more than 3 million customers are using Google Places in their Mercedes cars. And we are applying Google Cloud AI across a number of other use cases, ranging from a smart sales assistant to improving customer service in our call centers and optimizing our marketing. The sales assistant, for example, helps customers to seamlessly interact with Mercedes when booking a test drive or
navigating through Mercedes's offerings to find their next favorite vehicle. And now we're exploring further opportunities to work with Google Cloud AI, such as next-level navigation features. In addition, we're partnering on one of the most exciting technology topics in our industry: automated driving. This beautiful car right here is equipped with a Level 3 system for conditionally automated driving. We were the first manufacturer to get it certified in Germany, California, and Nevada. For our next-generation internal development and test platform, we will use Google Cloud as the backbone, helping us to become even more efficient and flexible in our product development. And Google Cloud's expert knowledge in processing massive amounts of data and scaling AI workloads will ensure that our cars get even more intelligent and AI-driven. Partnering with the very best in their respective fields is an important part of our software strategy, and Google is the perfect example of that. With Google Cloud, Mercedes-Benz is building new ways to deliver the most intelligent vehicles to our customers and to create personalized, intuitive experiences. We're really excited about working together. Thank you for having me. Okay, this is the biggest missed opportunity I've ever seen. Why isn't there an agent built into the infotainment system in the Mercedes? That seems like the most obvious use case. When you're driving, you can't use your hands to text or type or search or do anything. You could simply be talking to an agent to accomplish all of these different things for you. I don't know why they wouldn't have done that. I'm very surprised to see that they just skipped over that super obvious and super valuable use case. We're inspired by the agents that customers are creating using our generative AI platform and... All right, so a lot of good brands on here: ADT, Verizon, Target, Discover, Best Buy, etc. And they're all building agents, but I think they're all basically just customer service bots, which is
pretty disappointing. That's the most easy, obvious, simple use case, and I really think it speaks to how safe Google is playing it. Or maybe they're just thinking about it at the enterprise level, but there's really some cutting-edge stuff they could be doing, which I wish they were. ...our models. InterContinental Hotels Group will launch a travel planning capability to help each of you, their guests, plan their next vacation. ADT is building an agent to help customers select and set up home security systems. Verizon gives agents better recommendations. So these all seem like customer-facing bots, whether it's customer service or sales, and that's fine. There's definitely a lot of money in those use cases, but that's not as exciting to me. Magalu, one of Brazil's largest retailers, has put generative AI right at the heart of its customer service, having built a chatbot to enhance self-service and improve answer quality. And Target uses AI on the Target app and website. And by the way, I just want to point out, Google has had a product that does all of this for a very long time. My previous company used it. It was called Dialogflow, and it still is a product within the Google Cloud services suite, but it was very brittle, it was very hard to set up. So I understand why they're kind of relaunching these capabilities, but still, I'm a little disappointed that they're not more future-thinking in their capabilities. Minnesota's Department of Public Safety helps non-English speakers get licenses and other services with real-time translation. Best Buy is building an assistant that will help troubleshoot product issues, reschedule or combine order deliveries, or manage software. Discover Financial Services is using search and synthesis across detailed policies and procedures during customer service calls. And Orange's French-language agent is grounded in support knowledge, transforming their help and contact site and their customer experience. Oppo and OnePlus, leaders in smart devices, are incorporating our
Gemini models and Google Cloud AI into their phones to deliver innovative customer experiences, including news, audio recording summaries, AI toolbox, and much, much more. You know, the opportunity for customer agents is tremendous. To help each of you build customer agents faster, we're introducing Vertex AI Agent Builder. You can now create customer agents that are amazingly powerful in just three key steps. All right, so this is really what the Agent Builder is. It is not at the level of sophistication of an AutoGen or a CrewAI. It's really just a product that seems very similar to custom GPTs from OpenAI. First, you can use Gemini Pro to create free-flowing, humanlike conversations with text, voice, images, and video as inputs, and personalize them with custom voice models. Second, you can use natural language instructions to control the conversation flow and guide it on specific topics you don't want it to discuss, such as current events, in the same way that you train your human agents. You can also control when it hands over to a human agent, with transcription and summarization of its conversation history to make these transitions extremely smooth. Third, you can improve response quality with vector-based and keyword-based search to connect your internal information and the entire web. You can also use extensions to complete tasks for customers, like updating contact information, booking a flight, ordering food, and many more. And you can integrate enterprise data from operational databases like AlloyDB, predictive analytics with BigQuery, and SaaS applications like ServiceNow. Let's take a look at an example of a customer agent in action. Please welcome developer advocate Amanda Lewis. Thank you, Thomas. So last night I was watching a video of this band, and I love the keyboard player's shirt. So I was thinking I'd really like to be wearing that shirt tomorrow night, but can I find it in my size and in time to be rocking it at the concert here in Vegas? Let's head over to my favorite store. Oh,
this is, uh, so scripted and polished, it's a little bit cringey. They just launched a customer agent, and it leverages Gemini and vector search to deliver a seamless shopping experience. All right, I can't get over it. I just, I don't want these types of products personally. I know they're valuable, but they're out there. These have already existed for a while, and they're talking about it like it's so cutting-edge. Customer shopping assistants, customer support agents, sales agents: it's not interesting to me. So let me play the rest of this demo, and then I'm actually going to show you Vertex really quickly, and you're going to understand why I'm a little bit disappointed with Google's announcements today. What can we help you find? Well, I'd like that shirt, but I guess I have a few other specifications as well. So: find me a checkered shirt like the keyboard player is wearing. I'd like to see prices, where to buy it, and how soon can I be wearing it. I'm going to include the video. Now the customer... All right, that's cool. I'll give them credit for that. Being able to just drop a video and say, tell me where I can buy the shirt that that person's wearing, that is really, really cool. Although again, it's just for the shopping use case. I would have liked to see something a little bit more future-thinking. ...agent is using Gemini's multimodal reasoning to analyze the text and video to identify exactly what I'm looking for. Then Gemini turns it into a searchable format. How cool is this? It found the checkered shirt I'm looking for, right, and some other great options in no time. And that's because these results harness Google's trusted search technologies, which ensures customers like me get the right results in record time. The suggested products are grounded in Cymbal Fashion's inventory and historical performance data to make sure customers leave happy and with that purchase in hand. Okay, so I'm going to pause there. Let me show you Vertex AI Agent Builder now. All right, so this is their Agent Builder. I just
want to show it to you quickly. I'm going to make a full video all about it, but I want to show it to you because it's really telling about how Google is thinking about agents, and it's not how I think about agents. So over here we can create a new agent. I've already created one, Weather Agent. We'll click into it, and you give it a name, you give it a goal, and then you can give it instructions. One thing that I really do like about it is that the instructions can be very simple, and you can simply just list them like this: ask the user for their location and then use... and then anytime you have a dollar symbol right there, you can easily insert agents or tools. That interface is very, very nice. So I simply say: ask the user for their location, use tool weather, and the tool weather is one that I've already created. Let me show you. Over here we have our tools. Okay, so I created this weather tool. I have it as type function. I have no description, but you don't need one. And then you simply have the input parameter schema and the output parameter schema. Here's where I'm really confused: where's the actual code go? I don't see a place to put code anywhere. You can put input parameters and output parameters, but how do you actually say, okay, I want to hit this third-party API? And this is actually one of the samples that they give, and I just don't understand it. If you do, let me know in the comments. But basically, where do I actually put it? So let's see how it responds. Okay, so I have the Weather Agent selected right here. Let's test it out. What's the weather in Los Angeles? It formatted it properly. We have the tool input, Fahrenheit, Los Angeles, California, and then the output, temperature zero. Where does it actually get the temperature from? Submit function output. I'm sorry, I can't provide weather information. This is literally the example that they provide in the dashboard. It's very confusing. It's definitely not how I think about agents, but they're making progress, and so I appreciate their efforts so far.
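On the question of where the code goes: the "submit function output" step in the test console suggests that a function-type tool only declares the shape of the call, and that whatever application hosts the agent is expected to run the real code and hand the result back. Below is a minimal Python sketch of that general pattern, not Agent Builder's actual API. The names WEATHER_TOOL, get_weather, LOCAL_HANDLERS, and run_tool_call, the schema layout, and the hard-coded temperature are all assumptions made up for illustration.

```python
import json

# Hypothetical schemas, loosely mirroring the input/output parameter
# schema fields shown in the console. The layout is an assumption.
WEATHER_TOOL = {
    "name": "weather",
    "input_schema": {
        "type": "object",
        "properties": {
            "location": {"type": "string"},
            "unit": {"type": "string", "enum": ["fahrenheit", "celsius"]},
        },
        "required": ["location"],
    },
    "output_schema": {
        "type": "object",
        "properties": {"temperature": {"type": "number"}},
    },
}


def get_weather(location, unit="fahrenheit"):
    """Stand-in for a real third-party weather API call (hard-coded data)."""
    fake_readings_f = {"Los Angeles, California": 72.0}
    temp_f = fake_readings_f.get(location)
    if temp_f is None:
        raise KeyError(f"no reading for {location!r}")
    if unit == "celsius":
        return {"temperature": round((temp_f - 32) * 5 / 9, 1)}
    return {"temperature": temp_f}


# Map tool names to the local Python code that actually implements them.
LOCAL_HANDLERS = {"weather": get_weather}


def run_tool_call(tool_call):
    """Check required inputs, run the local handler, return its output as JSON."""
    args = tool_call["input"]
    required = WEATHER_TOOL["input_schema"]["required"]
    missing = [key for key in required if key not in args]
    if missing:
        raise ValueError(f"missing required inputs: {missing}")
    handler = LOCAL_HANDLERS[tool_call["tool"]]
    return json.dumps(handler(**args))


# Roughly what the agent emitted in the test: a tool name plus structured input.
call = {
    "tool": "weather",
    "input": {"location": "Los Angeles, California", "unit": "fahrenheit"},
}
function_output = run_tool_call(call)
print(function_output)  # {"temperature": 72.0}
```

In a real integration, get_weather would call an actual weather service, and the JSON string would be what gets submitted back to the agent as the function output.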
One thing that I do want to show you that's really cool is you can easily have all of these integrations. By the way, here's Dialogflow Messenger, which is that product that I just told you about, which is kind of their previous iteration of their agent framework. But you can integrate Twilio, Discord, all of these really easily, which is super nice. But these are basically just tools. And so yeah, that is the entire Vertex AI Agent Builder. It is essentially just custom GPTs by OpenAI. So we'll go create. You can list tools and agents, and it has a code interpreter. You can also add other tools here, but again, I don't really understand how tools work. And so I think this is the code: you basically have to format it in this YAML or JSON format rather than kind of just pasting in Python or whatever language you're most familiar with, which is okay, it's not great. The thing I like about it is it does have built-in authentication, which is nice and makes it really easy, and you can also have TLS certificates right there. But definitely not straightforward to use, and I would prefer simply just defining a method here and allowing the agents to call that function whenever they need it. All right, so now I think they're starting to get into something more interesting, which is agents in the workplace, meaning agents that can actually perform tasks and accomplish things, essentially kind of AI employees. So let's take a look. First, you create a custom model in the ways that we've shown before. From there, you connect them to all your company and web data. This can also be done with translation, so that your company information is available regardless of language. Similarly, we support multimodal inputs, including videos, call audio, and images, in addition to text. Now you will want to ground that in enterprise truth using databases like AlloyDB, BigQuery, and data from enterprise apps like SAP and, announcing today, HubSpot. Let's take a... Interesting that they're mentioning HubSpot, because it is rumored that Google
is going to acquire HubSpot, although it is just a rumor right now. And that's pretty cool that you can actually feed in all of your HubSpot CRM data into the agent. So let's keep watching. ...at an example of an employee agent in action. Please welcome developer advocate Gabe Weiss. Thanks, Lea. Hi folks. So I know you all want to hear about awesome AI stuff that's coming, but I need to talk to you for a minute about my annual benefits enrollment. See, I forgot I have to finish signing up by today, and as you can see, I might be a little bit busy. So if you don't mind, let's go ahead and look at this open enrollment email together. Okay, yep, I've got a deadline, I knew that, thank you. I've got FSA stuff, I've got an online portal from my company. Okay, there's a lot here. Uh, huh, they included a video. Let's see if this makes my life easier. Ah, okay, so it's almost an hour long. Yeah, I'm not going to have time to review all of this stuff. Let's see how this employee agent that we've developed using Google Workspace, Gemini models, and Vertex AI might be able to help me. As you can see, it's integrated directly into my Google Chat, so I don't have to context switch while I'm figuring all the stuff out. First things first, let's have it summarize the email and the video that it sent me. All right, that's awesome. I have been wanting to build an automation using AI that can read an email, look at all the context from that thread, and then all of the context of all of my emails, to try to write a draft that I can simply either edit or send. And that is kind of my dream, because I get a ton of emails. I wish I had that, and I think that's where they're headed with this product. Summarize the body and attached video from my recent email with subject open enrollment closing. So behind the scenes, the agent is referencing that email body and its attachments as context in the prompt using retrieval augmented generation. That is awesome. Awesome. Okay, that is very, very cool. That way, its response is limited to the content that matters to
me. The Gemini model's multimodal capabilities allow the agent to understand and reason across text, audio, and video from a single prompt. I mean, this is a way quicker read. Okay, good, and I can immediately see that the medical plans have been completely revamped this year. Let's go ahead and jump into the benefits portal to see more. Now, I've already done my dental and my vision, but I procrastinate, I mean, save the most important plan for last: my medical plan. Let's see how this option stacks up against my existing coverage. Compare these coverage... All right, that's really cool that you can basically just invoke a Google Drive folder, or a Google Drive agent I think, and then ask it for additional information. I'm very impressed with that. By the way, I didn't see anywhere in the Vertex AI Agent Builder where I could accomplish something like this. I think this is all just built in by Google behind the scenes into their products. This isn't something that you'll be able to build, but we'll see. ...options to the PDF doc I have on the Platinum plan. The Gemini model's long context window, paired with Vertex extensions, enables the agent to cross-reference large amounts of data from a variety of sources, including unstructured data like PDFs. Leveraging Gemini's advanced reasoning capabilities, the agent is able to understand the complex details of my current plan and compare it with the new options for 2025. And since the enterprise grounding feature links me to the exact data that Gemini used to draw its conclusions, which you can see linked here, I can confidently trust its recommendation that the Gold plan is best for me. And done. So now let's get a summary of my coverage. Let's say my house is multilingual, so I'd like to have it in Japanese also. Please generate a summary of 2025 benefits in a Google Doc in both English and Japanese. Although my source material is in English, the Gemini model's support for over 40 languages enables it to understand and respond in Japanese. And here we go. All right, this is
cool again, but again, this is all stuff that's built into the Google Workspace product. So very cool, I'll definitely be using all of this, but I wish they kind of added a lot of functionality into the Agent Builder that I could use. Now that I've officially completed enrollment, my daughter's going to need braces this year... I'm going to skip over this. I get the demo, fine. The agent knows that I'm at Google Cloud Next because it's integrated with... Yeah, so essentially now you have a personal agent to do everything for kind of your work: Gmail, Google Docs, Calendar. Very cool, I'll definitely be using it. So the next thing that they're going to talk about is a new product in their Google suite, or their Google Docs suite, of products. So they have Docs, spreadsheets, they have presentations, and now they're going to add video, which is really cool. Let's take a look. We believe that everyone can be a great creator and a great storyteller, but the formats and tools for storytelling at work haven't really changed that much. How many times have you heard, should we start with a doc or a deck? Well, we can do a lot better. I'm absolutely thrilled to announce our newest Workspace app, Google Vids. Sitting alongside Google Docs, Sheets, and Slides, Google Vids is an AI-powered video creation app for work. With Gemini in Vids, you have a video writing, production, and editing assistant all in one. Let me show you how simple it is to get started with Vids. Now, after a week with all of you here at Next, I'm going to want to share a recap video to share all the excitement with my organization. When I open up Vids, Gemini helps me get started. I simply type in a prompt using an existing document for context. All right, that's really cool that you can pass in context that easily. So I'm very impressed that everything that they're releasing with their kind of Workspace agents seems to be very integrated with itself, which is to be expected, but it is very cool. Now, based on that prompt, Gemini suggests a narrative outline for the story
that I could easily customize and edit. I choose an expressive style and Vids works its magic. So wow, just like that, I get the first draft with beautifully designed, fully animated scenes, complete with relevant stock media and music, and even a generated script. Yeah, all right, that's very cool. I wonder where it's pulling the stock media. So it's not actually creating video, AI video, but it is kind of pulling together different B-roll and different title sections, uh, and it's kind of putting the whole thing together. So pretty impressive. All right, so this is something I'm really excited about, uh, actual agents being able to code with you, and I'm hopeful this is going to be really cool because of Gemini's massive context window. So let's watch this video. So let's take a look at what's coming for Code Assist with Gemini 1.5 Pro, leveraging a 1 million token context window. I'm a new developer with Cymbal Outfitters, and today we show recommended products to customers only after they've made an initial selection. These suggestions are powered by our custom-built recommendation service based on previous purchases. But now the marketing department has asked me to move this feature to our homepage so that customers can see products that they might be interested in as soon as they get to our site. Our design department has created a mockup of what they would want this experience to look like in Figma. And for the developers out there, you know that this means we're going to need to add padding in the homepage, modify some views, and make sure that the configs are changed for our microservices. And typically it would take me a week or two to even just get familiarized with our company's codebase, which has over 100,000 lines of code across 11 services. But now, with Gemini Code Assist, as a new engineer on the team I can be more productive than ever and can accomplish all of this work in just a matter of minutes. This is because Gemini's code transformations with full codebase awareness allow
us to easily reason through our entire codebase, and in comparison, other models out there can't... Okay, so this looks like VS Code, which is kind of interesting given this is Google, but I guess this is built into VS Code. This is some kind of extension, I'm not sure. Let's keep watching. ...handle anything beyond 12 to 15,000 lines of code, and even then they struggle to get it right. Gemini inside of Code Assist is so intelligent that we can just give it our business requirements, including the visual design. So let's ask. Here I am prompting Gemini to add a For You recommendation section on the homepage... All right, and again, very, very cool that you can just drop a Google Drive link right into Gemini and it will grab that context. So I'm impressed by their ability to just essentially drop any source of information at any time into Gemini. ...along with an image of the future state to show the improved design. Almost immediately, Gemini Code Assist starts by reasoning about the code changes that it needs to make, and has insights an experienced teammate would have. For example, because we asked Gemini Code Assist to change the recommendation service, it was able to find the recommendation function and extract out the exact details needed to make the call to the recommendation service. It highlights the files needing to be changed and reveals the reasoning behind its recommendations. Using our own codebase for context, Gemini Code Assist doesn't just suggest code edits: it provides clear recommendations and makes sure that all of these recommendations are aligned with Cymbal Outfitters' security and compliance requirements. In Code Assist, we've also added an option to apply the edit, which keeps me as the developer in the driver's seat. So let's take a look at the source code changes that Gemini Code Assist has made in our codebase. It looks like we have multiple edits across two files, handlers.go and also home.html.
Gemini Code Assist even applied these changes to the full repository. And to put this in context, no pun intended, it would have taken me over 70 hours nonstop to even just read through all of these files. All right, I think that's kind of a little bit of BS marketing talk, because you don't necessarily have to read through every single file, every single line of code, to actually make modifications to the codebase. But fine, I understand what she's saying. Just like I would with any code change, my next step is to check the work out by testing out the modified app locally. So let's try it. And there we go: the For You recommendation section is exactly what our marketing team was asking for. All right, so very cool. And this is a very simple marketing page that they're updating, so it's kind of a simple use case, but I'm excited to try it out. Anything with AI and coding, you know I'm all about. I'll definitely make a video about that as well. So I think I'm going to call this video right here. Google announced some really cool stuff. I wish the Agent Builder would have been more sophisticated, but overall, all of the functionality that they're adding into the Google Workspace product is very welcome. So if you liked this video, please consider giving a like and subscribe, and I'll see you in the next one.
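The context-window figures quoted across the keynote clips do not obviously line up with each other, and quick arithmetic makes that visible. Here is a rough Python sketch using only the numbers from the video (1 million tokens, over 700,000 words, well over 30,000 lines of code, and the Code Assist demo's codebase of over 100,000 lines). The function names are hypothetical, and the 12-tokens-per-line average near the end is a made-up assumption, not a measured value.

```python
# Figures quoted in the keynote clips.
CONTEXT_WINDOW_TOKENS = 1_000_000
QUOTED_WORDS = 700_000          # "over 700,000 words"
QUOTED_CODE_LINES = 30_000      # "well over 30,000 lines of code"
DEMO_CODEBASE_LINES = 100_000   # Code Assist demo: "over 100,000 lines of code"


def tokens_per_unit(window_tokens: int, units: int) -> float:
    """Average token budget per word or per line if `units` exactly fill the window."""
    return window_tokens / units


def max_lines_that_fit(tokens_per_line: float,
                       window_tokens: int = CONTEXT_WINDOW_TOKENS) -> int:
    """How many lines fit in the window at an assumed average tokens-per-line."""
    return int(window_tokens // tokens_per_line)


print(round(tokens_per_unit(CONTEXT_WINDOW_TOKENS, QUOTED_WORDS), 2))       # 1.43 tokens per word
print(round(tokens_per_unit(CONTEXT_WINDOW_TOKENS, QUOTED_CODE_LINES), 1))  # 33.3 tokens per line
print(tokens_per_unit(CONTEXT_WINDOW_TOKENS, DEMO_CODEBASE_LINES))          # 10.0 tokens per line

# ASSUMPTION: 12 tokens per line is an invented average for illustration only.
print(max_lines_that_fit(12))  # 83333
```

Read this way, the 100,000-line demo only fits a 1 million token window if lines average about 10 tokens or fewer, so the quoted "well over 30,000 lines" figure looks more like a conservative floor than a hard limit.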
Info
Channel: Matthew Berman
Views: 184,185
Keywords: ai agents, google, ai, llm, agents, artificial intelligence, google cloud next
Id: _AOA6M9Ta2I
Length: 34min 20sec (2060 seconds)
Published: Fri Apr 12 2024