How to automate Google Workspace tasks with Gemini

Captions
[MUSIC PLAYING]

MARK MCDONALD: Hi, everybody. Welcome to our workshop.

AUDIENCE: Woo!

MARK MCDONALD: Woo! Thank you. Love that energy. [APPLAUSE] Nice. After lunch, too. I would be in a coma.

So my name is Mark, and this is Wei. We're on the AI DevRel, or Gemini DevRel, team at Google. We're joined today by our TA, Kara, from the Workspace DevRel team. This is a recorded workshop, so you won't be able to ask questions interactively as we go. But if you do need a hand, raise it, and Kara will come and help out.

The format today is a codelab we've written up for this talk. If you load it up on your machines, you'll be able to follow along. The most important part is being able to copy and paste all of the code. There are some blocks where we move fairly quickly through 10 or 20 lines, so if you copy and paste, you'll be able to keep pace. And if we're going too fast or too slow, you can move ahead or fall behind as you need to.

Once there are no more phones up, we'll move on. I think everybody is probably familiar with what Gemini is by this point in the conference: it's our family of multi-modal large language models. Today, we're going to use the Gemini API from Apps Script to automate some common Workspace workflows.

For today's talk, the main thing we need to know about Gemini is that at a functional level, below the API, the model works as a next-token predictor. You give it some text encoded as a sequence of tokens, and it gives you probabilities for what the next token should be. You can exploit this to do things like put in a question and have the model autocomplete the answer, or put in a document and a task and have it summarize the document. You can brainstorm ideas. And these models are multi-modal, so as well as text, you can put in image, audio, and video data and perform the same kinds of tasks. Today, we'll be analyzing a chart in a spreadsheet, but you can also use the multi-modal capabilities to do things like take photos of objects in the real world and map them into spreadsheets or documents.

The format for today: I'll show you a couple of things in AI Studio, and we'll test out our key. We'll move across to Apps Script and start writing code that connects to the Gemini API. And then we'll build some tools on top of that to do more advanced things within Apps Script and Workspace. By the end of this session, we'll have a fully automated chatbot, or agent, you could say, that takes open-ended user text and performs a task in what appears to the user to be a single, fully automated step.

The first place we want to start is AI Studio. Open this up now; we're going to grab an API key as the first thing. AI Studio is the web interface to the Gemini API, and it's where you would normally start. You can practice writing prompts, explore how wording changes affect the output, save your prompts to build up a library of different tasks, and explore different models and model settings to see what works for your application. So here's one we prepared earlier. If you open up AI Studio for the first time, you'll see a welcome dialog.
You'll need to agree to some terms. I've already done this, so I'm good to go. When you're prompted, or if you're on this screen, click Get API key. You'll be able to create an API key in this step, so we hit Create API key. If you already have a Google Cloud project set up, you'll be able to choose one from this list. If this is your first time creating a Google Cloud project with this account, you'll see another button where you can create a new API key in a new project. Either way, create a key; it should only take a second. Then you can hit Copy. Keep this tab open, because we'll need to come back here a couple of times, but for now, we'll get on to testing our key.

I'm going to switch over to a command line here and use a program called curl. If you have a command line with curl already and you know how to use it, feel free to follow along. All I'm doing here is testing the key, just to make sure everything works. If you aren't familiar with curl, or don't have it set up, don't go out of your way to install it, because it'll take too long.

First, we set up our key in an environment variable and paste it in here. Then we copy this command, which makes an HTTP request to the Gemini API, specifically the /models endpoint. When we make a GET request here, it lists all of the models that are available to us with our API key. And that was super quick: we get a JSON response back containing a list of objects that describe all of the available models. We've got things like Gemini 1.0 Pro Vision, with a display name, a detailed description, and some information about the features the model supports: the number of input tokens, temperature, and other default parameters. If you see this, it means it worked.

In practice, though, what we want to do is make an actual request to generate some content. If you scroll down a little in the codelab, you'll see a request on the left. This is what a generateContent request looks like. There are a few parts here. The first item in the object is contents: a list of turns in a conversation. In a simple text-in, text-out situation like the one here, there's only one entry in this list, a single turn. In a chat or multi-turn conversation, each turn would be an entry in this list. Inside each turn, we have parts; each part is a chunk of the conversation. Here, we just have a simple text part. When you add multimedia to your prompts, you add extra entries here: text, images, audio, and so on.

Because this is more than one line, we're going to put it all into a text file. I like to use cat because it's pretty straightforward: presentation.txt. Paste that in, and that's done. Now we make a similar curl request, but this time it's a POST with a JSON body from the file we just created, presentation.txt. It's the same endpoint as before, except this time we're specifying models/gemini-1.0-pro-latest. For this codelab, we're using the 1.0 series of models; there is also a whole family of 1.5 models available, including Flash, which was announced yesterday.
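For reference, the two curl calls look roughly like this. This is a sketch, not the codelab's verbatim commands: the endpoint shown is the public v1beta REST endpoint, and the prompt text inside presentation.txt is a placeholder, since the talk doesn't show the exact request body.

```sh
# Store the key from AI Studio in an environment variable.
export GOOGLE_API_KEY="YOUR_API_KEY"

# Sanity check: list the models available to this key.
curl "https://generativelanguage.googleapis.com/v1beta/models?key=$GOOGLE_API_KEY"

# Write a generateContent request body to a file. One turn in "contents",
# one text part. (The prompt here is a placeholder.)
cat > presentation.txt << 'EOF'
{
  "contents": [
    {
      "parts": [
        { "text": "Give me a fun fact about large language models." }
      ]
    }
  ]
}
EOF

# POST the request to the 1.0 Pro model's generateContent endpoint.
curl -X POST \
  -H "Content-Type: application/json" \
  -d @presentation.txt \
  "https://generativelanguage.googleapis.com/v1beta/models/gemini-1.0-pro-latest:generateContent?key=$GOOGLE_API_KEY"
```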
But 1.0 is still very capable and has really generous quota and usage tiers, so we're using it today because it's simple, quick, and cheap. On this model, we call generateContent and, of course, specify our API key. So let's press Enter and see what happens.

OK, we've got a response back. In the response, you can see a list of candidates, each with content and parts, and the text is the actual response to what we put in: a markdown-formatted string. Looks great. We get some extra fields back in the response, too. There's a finishReason, which tells us whether generation stopped because the model finished or because something else happened: if we hit a safety filter, it would be specified here, and if we ran out of tokens compared to the number we requested, that would show up here too. We get the safety classification for the response, so if something was a little bit spicy, it would be captured here and you'd be able to see why. And we get information about the tokens used in generating this request and response pair, which is useful for tracking usage, especially if you have billing enabled and are paying per request. But for a test, this is done; we know it works. If you want to see what the response looks like formatted as rich text, that's it here.

Now we're going to switch across to Apps Script. Copy this again. To start editing in Apps Script, there's a handy shortcut URL: just type in script.new and it loads the web editor. Here, we get a file that's been created for us, with a little function stub where we can start writing. What we want to do is write a utility library. The first thing is to name our project: IO 2024. Then we rename the file from Code to Utils, and since we're putting our own code in, we delete the stub. Let me make this a little bigger. Yeah, that's good.

The first thing we want to do in the code is specify our API key and our endpoint. We don't want to put the key in our code, though. We want to be able to ship the source to colleagues and pass it around without worrying about our key being used by everybody or shared publicly. It's a secret, and we want to keep it that way. The way we do this in Apps Script is through project properties: in Project Settings on the left, with the cog, down at the very bottom, we have Script Properties. Click Add script property, type in the key name just like before, and paste in -- oops, that's not the key. Refresh here, copy, paste that in, and save. Now we've got the key specified in the properties, and we can get to it from the code. So we copy across our first code snippet and paste it in. The first two lines load up the Properties Service and pull out the key we just specified, so if a colleague has a copy and has set their own key, it will pull theirs out. Then all we're doing is specifying the endpoint that we're going to use.
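That first snippet looks roughly like this. A sketch: the property name GOOGLE_API_KEY is an assumption standing in for whatever you typed into Script Properties.

```javascript
// Load the script properties and pull out the API key, so the key
// itself never appears in the source code.
const properties = PropertiesService.getScriptProperties();
const geminiApiKey = properties.getProperty('GOOGLE_API_KEY');

// The generateContent endpoint for gemini-1.0-pro-latest, with the
// key plugged in as a query parameter.
const geminiEndpoint = 'https://generativelanguage.googleapis.com/v1beta/'
    + 'models/gemini-1.0-pro-latest:generateContent?key=' + geminiApiKey;
```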
This is gemini-1.0-pro-latest, and we plug our API key in here; we'll use that in the requests coming up. Now, the first thing we want for this project is a function that calls Gemini itself. I'm going to copy and paste this function, callGemini. It takes two arguments: a prompt, the text input that comes in, and, optionally, a temperature. The temperature defines how creative or variable the output can be: a temperature of 0 is consistent and gives you the same results every time, while higher values make the output more variable. The payload is exactly the same as what we just saw with curl: contents, parts, and text. We wire the prompt from the function into this object, and the same with the temperature, which we specify in the generationConfig. As with curl, it's a POST with a JSON body, and we use UrlFetchApp, which is part of Apps Script, to make the HTTP request. As a bit of a shortcut, from the returned content we take the first candidate, the first part, and its text field, and return that. This makes callGemini a simple text-in, text-out function.

Now, to test this in Apps Script, we need a function that doesn't take any arguments, so we paste in this testGemini function. It takes nothing; we've hard-coded the prompt, which lets us test it from the UI directly. With the prompt "the best thing about sliced bread is," we call Gemini and log both the prompt and the output. To test, hit Save, and the function is now available in this dropdown here, testGemini; press Run. Because we've started connecting to an external service (that's UrlFetchApp), we need to authorize. Any time you add new permissions or external services that might access the internet or private data within Workspace, your app needs permission to do that, so you need to authorize it. That's what we're doing here, and we'll need to do it each time we add a new service as we go. We see this scary-looking warning screen because we're in dev mode and haven't set up OAuth yet. You can set up OAuth in the Cloud Console (there's an extra setup flow you can go through), and publishing your app to the Workspace Marketplace takes care of it as well. Since we're in dev mode and wrote this code ourselves, we can just click to proceed. We trust ourselves, right? And now it runs. You can see the response: the best thing since sliced bread is pocketed bread. Sure.

So that was text. We're going to try images now. With the Gemini 1.0 models, vision uses a different endpoint, so we go up to the top of the file and add in this Gemini Pro Vision endpoint, the same as the previous one except it says pro-vision. To use it, we build a utility function, callGeminiProVision. Scroll, copy, drop that in. This is very similar to the previous function: we give it a prompt and temperature like before, but now we also pass in an image object, which will form part of the prompt. In order to pass an image to the Gemini API, which is a text-based HTTP request, we need to encode it using Base64. This works perfectly fine for requests up to about four megabytes in size (see the sketch below).
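Here's a minimal sketch of the two utility functions just described, assuming the geminiEndpoint constant from earlier plus a matching Pro Vision endpoint. The model name gemini-1.0-pro-vision-latest and the exact parsing details are assumptions; the codelab's code may differ.

```javascript
// Vision uses its own endpoint on the 1.0 models.
const geminiProVisionEndpoint = 'https://generativelanguage.googleapis.com/v1beta/'
    + 'models/gemini-1.0-pro-vision-latest:generateContent?key=' + geminiApiKey;

// Simple text-in, text-out call to Gemini.
function callGemini(prompt, temperature = 0) {
  const payload = {
    contents: [{ parts: [{ text: prompt }] }],
    generationConfig: { temperature: temperature },
  };
  const options = {
    method: 'post',
    contentType: 'application/json',
    payload: JSON.stringify(payload),
  };
  const response = UrlFetchApp.fetch(geminiEndpoint, options);
  const data = JSON.parse(response.getContentText());
  // Shortcut: return just the text of the first part of the first candidate.
  return data.candidates[0].content.parts[0].text;
}

// Text plus a single inline image. The image is Base64-encoded into
// the JSON body, which works for requests up to roughly 4 MB.
function callGeminiProVision(prompt, image, temperature = 0) {
  const imageData = Utilities.base64Encode(image.getAs('image/png').getBytes());
  const payload = {
    contents: [{
      parts: [
        { text: prompt },
        { inlineData: { mimeType: 'image/png', data: imageData } },
      ],
    }],
    generationConfig: { temperature: temperature },
  };
  const options = {
    method: 'post',
    contentType: 'application/json',
    payload: JSON.stringify(payload),
  };
  const response = UrlFetchApp.fetch(geminiProVisionEndpoint, options);
  const data = JSON.parse(response.getContentText());
  return data.candidates[0].content.parts[0].text;
}
```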
When you hit that limit, you can still go bigger; the API supports larger inputs, but you need to use the Files API: you upload your content first (for video, longer audio files, or numerous images) and then reference it in the prompt. For everything we're doing today, single images, inline data fits, and it's the simplest. So you can see here, we have parts: a text part, and this is our image part. Everything else is the same: we POST the JSON request and return the text field from the first candidate. That should do images; let's test it out.

Here, we have a prompt, "provide a fun fact about this object," and we're using UrlFetchApp to download this image, then encoding it and sending it in the request. Let's take a look at what it is: it's a pipe organ. This should be fun. We call callGeminiProVision and log the input and output. Save it, choose testGeminiVision, and press Run. It downloads the image, encodes it, and sends it to Gemini. And here we go: provide a fun fact about the object -- the largest pipe organ in the world is located in Atlantic City. Wow, very fun. Cool.

So that's text and images. The next power feature we're going to add is using Gemini with tools. Specifying tools allows us to give the model access to APIs or functions we have available on our side of the system. We tell the model about them, and if it needs to use one of these functions, then instead of returning text, it will say, "hey, call this function for me." You call it, return the response, and the model continues with the conversation or request you started with. This allows you to do things like calling your own private internal APIs: you don't need to expose them to Google; you can call them indirectly yourself. You can connect to public APIs. Or you don't need to call a function at all: you can use this to generate structured data from the API.

Let's plug it in. We take the callGeminiWithTools function and paste it in. It's the same pattern as everything else so far: we've got the prompt and temperature, but this time we're also specifying a tools object, which goes at the top level of the request, so at any point in the conversation the model can call out to these tools. Everything else is the same -- we use the text endpoint -- except down here, instead of returning the text field from the returned part, we return this functionCall object. This defines the function the model has asked us to call; we return it and continue on.

To give you an idea of what this looks like, the test function has one built in. The prompt asks the model to tell me how many days are left in this month. This is something the model can't answer immediately without other information: large language models are trained at a specific point in time, they have knowledge up to that point, and they don't really know or learn anything beyond it unless we give them extra, explicit information. That's what function calling, what these tools, can help us do. Here, we're passing in a function called datetime, which returns the current date and time, like now, as a formatted string. So hopefully, the model will say, "OK, I need to call this before I can answer your question," and continue on. The declaration follows the OpenAPI specification.
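A sketch of the tool-enabled variant and the datetime declaration it's tested with. Field names follow the REST API; the test function name and exact description wording are assumptions.

```javascript
// Like callGemini, but with a tools object at the top level of the
// request, and returning the functionCall the model asks for.
function callGeminiWithTools(prompt, tools, temperature = 0) {
  const payload = {
    contents: [{ parts: [{ text: prompt }] }],
    tools: tools,
    generationConfig: { temperature: temperature },
  };
  const options = {
    method: 'post',
    contentType: 'application/json',
    payload: JSON.stringify(payload),
  };
  const response = UrlFetchApp.fetch(geminiEndpoint, options);
  const data = JSON.parse(response.getContentText());
  // When the model wants a tool run, the part carries a functionCall
  // object instead of plain text.
  return data.candidates[0].content.parts[0].functionCall;
}

// Declaration for the datetime helper: no input parameters, just a good
// description, which is what the model uses to decide when to call it.
const datetimeTool = {
  functionDeclarations: [{
    name: 'datetime',
    description: 'Returns the current date and time as a formatted string.',
  }],
};

function testGeminiWithTools() {
  const functionCall = callGeminiWithTools(
      'How many days are left in this month?', [datetimeTool]);
  console.log(functionCall);  // Expect something like { name: 'datetime', args: {} }
}
```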
We've given it a name, a description, and parameters. This one only has a return type, a string, but if there were other arguments the model needed to provide when calling it, they would be specified here as well. Much like writing code -- and I'm sure you all write great docstrings -- it's really important to do this well for large language models and tool calling, because this text description is how the model decides whether to actually invoke the function. So here, we call that and log the output like before. Save, choose the function, and run. You can see here our prompt, and the model has returned an object saying, "call datetime." In a practical situation -- and in a second, you'll do this -- you would then call the function, provide the result back, and continue on. This is just a test proving that it works; we'll connect the real function in a second. And with that, I'll hand over to Wei to help you through it.

WEI WEI: All right, thanks, Mark. So now you understand how function calling works, and we can actually integrate the Gemini API with different products. In the rest of this workshop, we're going to build three integrations with Google Workspace products: Google Calendar, Gmail, and Google Slides. Here's a high-level diagram: if you scroll down to section 7 in the codelab, this diagram describes the high-level architecture of what we're working on. At the end of this workshop, we'll have a system that takes in a user query in plain English -- for example, "set up a meeting at 9:00 AM tomorrow with someone" -- and a little routing, or dispatching, mechanism that uses function calling, which you just learned about, to dispatch that user query to one of three separate tools. The first tool sets up a meeting automatically using the Gemini API, the second drafts an email automatically in Gmail, and the third creates a skeleton deck automatically. That's the high-level idea.

And to make this workshop even more interesting, within each tool we make a second Gemini API call. In the first tool, we call the Gemini 1.0 Pro model to summarize a blog post and put that summary into the meeting description. In the second, we use the Gemini Pro Vision API to analyze a chart embedded inside a spreadsheet, and then we ask the model to compose an email based on its analysis of the chart. And lastly, we use the Gemini API to help us brainstorm ideas about a given topic and put all those talking points into a skeleton deck. If you think about it, this is chaining two Gemini calls in sequence: first, function calling, and then, depending on which tool is chosen, a second call.

In terms of coding, for each tool we need to do three things. First, we need an if-else block as our dispatching mechanism, which routes the user query to one of the tools based on the function calling result. Second, we need to actually write the code that implements the functionality of each tool.
And then lastly, when we invoke function calling, we have to declare the list of tools to the Gemini model so that it knows the tools exist and can make the right return. That's the quick description of what we're going to build; now, let's move into actual coding.

The first use case, as I mentioned, is to use plain English to have our system automatically set up a meeting. While setting up the meeting, the system will automatically summarize a blog post and attach the summary to the meeting description. To demonstrate that, go ahead and click this download link for a text file: it's a text copy of the Gemini 1.5 launch blog. It has a lot of text, and we don't want to put all of it in the meeting description, which is why we're going to use the Gemini API to summarize it. Click Download, and once the file is on your computer, switch to your Google Drive and either drag the file into one of your folders, or click New, choose File upload, pick the file, and finish the upload. I already have the file in my folder, so I won't be doing that, but in your case, you should finish the upload before moving forward.

So that's the first step: we now have a sample text file. The next step is to implement our dispatching logic. Go ahead and copy the code in step 3 of section 8, and switch back to our code editor. We create a new file; let's call it main. We remove the boilerplate code and paste the main function in there (it's sketched below). There are a few things going on here. The first line is a mock user query, which we'll use as a test: we're asking our system to set up a meeting at 10:00 AM tomorrow with Helen to discuss the news in the Gemini 1.5 blog file. We then use the Gemini API's function calling feature to dispatch the query: we send this mock user query along with a list of Workspace tools, which we'll define in a minute. And lastly, we have this if-else block as our dispatching logic: if the function calling response asks us to call the setupMeeting tool, we go ahead and call the setupMeeting function, which takes the meeting time, the recipient or participant, and a file name, which is our blog file's name.

That's our dispatching logic, and we have that code in place. The second step is to implement the actual functionality of our setupMeeting tool. Before doing that, we need to adjust the Apps Script configuration by adding the advanced Google Calendar service: click the little plus sign by the Services tab, find Google Calendar in the list, and add it. Very straightforward. This is needed because we're going to use some advanced functionality of the Google Calendar API. Now, moving on to step 5, copy over this code, change to the Utils file, and scroll to the bottom. This is our actual tool implementation, and as you can see, the function takes in a meeting time, recipients, and a file name. First, we use the Google Drive API to find the text file you just uploaded to your Google Drive.
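For reference, the dispatching logic in main.gs might look roughly like this. A sketch only: WORKSPACE_TOOLS and the argument names time, recipient, and fileName are illustrative, not necessarily the codelab's exact names.

```javascript
function main() {
  // Mock user query for the first use case.
  const query = 'Set up a meeting at 10am tomorrow with Helen to discuss ' +
      'the news in the Gemini 1.5 blog file.';

  // Let Gemini's function calling pick the right tool for this query.
  const functionCall = callGeminiWithTools(query, [WORKSPACE_TOOLS]);

  // Dispatching logic: route the query to the chosen tool.
  if (functionCall.name === 'setupMeeting') {
    const args = functionCall.args;
    setupMeeting(args.time, args.recipient, args.fileName);
  }
  // else-if blocks for the other tools get added here later.
}
```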
Then we extract all the text from that file using this function. Next, we ask the Gemini API to summarize all the text in the file using a really simple prompt: basically, we're asking the model for a summary and a title. But there's one thing to note here: we're prompting the model to return JSON to us. This is needed because we're using Gemini 1.0; if you use newer models, such as Gemini 1.5 Pro, there's a JSON mode that returns JSON directly, so you don't have to do this kind of prompting anymore. Once we get the response back, we do some post-processing and extract the title and the summary from the Gemini output. Now that we have the title and the summary of the file, we go ahead and create the calendar event in Google Calendar.

If you haven't used Google Calendar much, here's a quick UI demo of how it looks. There are a lot of time slots; you can pick pretty much any time you want, change the meeting title, add guests or participants, add a more detailed description, and attach a file to the meeting as well. Usually, you would type everything up manually and hit Save. In this case, we're using the API to do all of that for us: we call CalendarApp and ask it to create an event -- in this case, a meeting -- saying, "meet somebody at a certain time to discuss a certain topic," and Google Calendar goes ahead and does it. Next, we set the meeting description using the file summary returned by the Gemini API. And lastly, we call attachFileToMeeting to attach the text file in your Google Drive to that meeting. That function is implemented a little further up, right here. We won't go into the details of all its API calls, but it attaches the file to the meeting using the advanced Google Calendar service, which we just enabled a few minutes ago. This function was actually implemented by our TA, Kara, so thanks for that.

That's the second step, implementing the tool. One last thing: remember, the third thing we need to do is declare the list of available tools when we invoke function calling. Go ahead and copy the code block in step 6, scroll up to the top of your Utils file, and paste it in. This code defines a list of functions, or tools. Our first tool is called setupMeeting, with a really short description, and it takes in three parameters -- a meeting time, recipients, and a file name -- all of them required.

So that's all three steps, and we have finished all of them. Now, we're ready to run the test (a condensed sketch of the tool and its declaration follows). Switch back to the main.gs file and click Run. It will ask me to authorize the permissions again; go ahead and do that and allow access. It will trigger two separate Gemini API calls, which takes a few seconds, so let's give it a moment. In the log, you can see a printed message that says your meeting has been set up. At this point, you can switch to your Google Calendar.
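Here is that condensed sketch of the pieces we just wired up: the setupMeeting implementation and its entry in the tool list. The prompt wording, the JSON post-processing, the use of createEventFromDescription, and the attachFileToMeeting signature are all assumptions; the codelab's code differs in detail, and attachFileToMeeting itself is the helper implemented there via the advanced Calendar service.

```javascript
// Tool list declared at the top of utils.gs (sketch).
const WORKSPACE_TOOLS = {
  functionDeclarations: [{
    name: 'setupMeeting',
    description: 'Sets up a meeting in Google Calendar and attaches a file.',
    parameters: {
      type: 'OBJECT',
      properties: {
        time: { type: 'STRING', description: 'Meeting time, e.g. "10am tomorrow"' },
        recipient: { type: 'STRING', description: 'Name of the participant' },
        fileName: { type: 'STRING', description: 'Drive file to discuss' },
      },
      required: ['time', 'recipient', 'fileName'],
    },
  }],
};

function setupMeeting(time, recipient, fileName) {
  // Find the uploaded blog text file in Drive and extract its text.
  const file = DriveApp.getFilesByName(fileName).next();
  const blogText = file.getBlob().getDataAsString();

  // Ask Gemini for a title and summary, prompting explicitly for JSON
  // (Gemini 1.0 has no dedicated JSON mode).
  const prompt = 'Summarize the following text. Return pure JSON with ' +
      '"title" and "summary" fields only.\n\n' + blogText;
  const response = callGemini(prompt);

  // Post-process: strip any markdown fences, then parse the JSON.
  const parsed = JSON.parse(response.replace(/```(json)?/g, '').trim());

  // Create the event from a natural-language description, then fill
  // in the Gemini-written summary.
  const event = CalendarApp.createEventFromDescription(
      'Meet ' + recipient + ' at ' + time + ' to discuss "' + parsed.title + '"');
  event.setDescription(parsed.summary);

  // Attach the Drive file via the advanced Calendar service
  // (helper implemented in the codelab; signature assumed here).
  attachFileToMeeting(event, file);

  console.log('Your meeting has been set up.');
}
```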
And as you can see, there's a meeting set up on my Google Calendar. If you click it, you can see the title, a detailed description -- which is a summary of the blog file we just uploaded -- and the attachment, which is the text file from your Google Drive. That's our first use case: from the user's standpoint, we provide one simple English sentence to set up the meeting, and everything is done automatically for us.

Now, let's move on to the second use case, which is to draft an email for us automatically, where the email body is based on Gemini Pro Vision's analysis of our chart. To demonstrate that, we've created a spreadsheet about college expenses. I basically made these numbers up; they're definitely just for demonstration purposes, not real numbers, but for our demo, they're sufficient. In this case, let's say I'm doing some kind of data analysis and I create a chart based on these numbers. We're going to send the image of this chart to the Gemini Pro Vision API, ask the model to do the analysis for us, and then compose an email based on that analysis. Go ahead and save a copy of this spreadsheet to your Google Drive. Once you do that, make sure to remove "Copy of" from the file name, and then make the copy. You should now have the spreadsheet in your Google Drive. I already have a copy there, so I won't be doing that. That's our demo spreadsheet with the chart embedded.

Now, we come back to the three steps I mentioned at the beginning. First, we add the dispatching logic again, because this is a separate, new tool. Copy this else-if block, come back to your main.gs file in the editor, and paste it into your if-else block. In this case, if the function calling response asks us to call the draftEmail tool, we call that function with the spreadsheet name and a recipient, since we're drafting an email. That's step one.

Second, we implement the actual functionality of this tool. Copy the code block in step 4, change over to the Utils file, scroll down to the bottom, and paste it in. Let's look at what this function is doing. We use a prompt that asks the Gemini Pro Vision model to compose an email body for someone based on the model's analysis of the chart embedded inside our spreadsheet, with some additional prompting to help the model output useful information. Then we use the Google Drive app to find the spreadsheet file and identify the chart embedded in it. Now, we have the chart, but the chart is actually an object, not an image, so we save it to your Google Drive as a PNG file; we'll use this PNG file again in a minute. Then we're finally ready to call the Gemini Pro Vision API with the prompt we defined up here, sending the chart in with it. Note that we're not sending any of the numbers or text inside the spreadsheet -- just the pure image -- to the model. And once we get back the drafted email body composed by the model, we use the Gmail app to create the email draft.
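A sketch of the draftEmail tool as described. The chart lookup (first chart on the first sheet), the prompt wording, the recipient domain, and the subject line are illustrative assumptions, not the codelab's exact code.

```javascript
function draftEmail(sheetName, recipient) {
  const prompt = 'Analyze the chart in this image and write the body of a ' +
      'short, professional email to ' + recipient +
      ' summarizing the key insights.';

  // Find the spreadsheet in Drive and grab its first embedded chart.
  const file = DriveApp.getFilesByName(sheetName).next();
  const sheet = SpreadsheetApp.openById(file.getId()).getSheets()[0];
  const chart = sheet.getCharts()[0];

  // The chart is an object, not an image: render it to PNG and save a
  // copy in Drive so we can attach it to the draft afterwards.
  const chartBlob = chart.getAs('image/png').setName(sheetName + ' chart.png');
  const chartFile = DriveApp.createFile(chartBlob);

  // Send only the chart image to Gemini Pro Vision -- none of the
  // underlying cell data goes to the model.
  const emailBody = callGeminiProVision(prompt, chartBlob);

  // Create a Gmail draft with the generated body and the chart attached.
  GmailApp.createDraft(
      recipient + '@example.com',                   // demo email domain
      'Insights from the ' + sheetName + ' chart',  // hardcoded subject
      emailBody,
      { attachments: [chartFile.getBlob()] });

  console.log('Email draft created.');
}
```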
In the draft, we're adding the recipient -- and, of course, I'm using a demo email domain. There's an email subject, which is hardcoded here, though you can change that. There's the email body from the Gemini API response. And lastly, we attach the chart image: remember, we just saved the chart image in our Google Drive right here, and we attach that file from Google Drive to the email. That's step two, implementing the tool's functionality. Now for the final step, which is to declare the tool when we invoke function calling. Copy the code in step 5, scroll up to the top of your utils.gs file, and paste it in right after the comment, "add your tools here." I probably missed something right there, but let me -- yes, you just need this code, not the entire code block. OK, no more errors. Let's look at this part of the code: it declares a new tool called draftEmail, which takes in two parameters, the spreadsheet name and the email recipient. And that's step three.

With that, we're ready to test. If you scroll up to step 3, we need to change the mock user query before we actually run it. Copy out the previous user query and paste in this one, which asks our system to draft an email for Mary with insights from the chart in that spreadsheet -- pretty straightforward English. Now, click Run. Again, it will ask me to authorize, and I allow access. It kicks off and takes a few seconds to finish; remember, this time we're sending an image, which is usually bigger than a text prompt. Looking at the log, the system has successfully created an email draft for us. Now, if you open your Gmail account, go to your Drafts folder, and click Refresh, there's a nicely written email, composed entirely by the Gemini API. We didn't type anything. You can see the email address, the subject, and the recipient, and the whole email body was written by Gemini. All you need to do is review the copy and make some minor changes. The chart image is also attached automatically. It saves you a lot of time. So that's the second integration we wanted to demonstrate.

There's a third one, which uses the Gemini API to brainstorm on a topic, come up with interesting bullet points and talking points for you, and produce a skeleton deck at the end. This is a quick GIF demo of the deck it would create. But in the interest of time -- this workshop is only 45 minutes -- we're not going to go through this integration. Please feel free to check it out after you get home; it's actually easier than the two integrations we just built.

Now, in this workshop, we've demonstrated a couple of integrations, but there are, of course, a lot more ideas and possibilities you can try. For example, you can easily build a chatbot for Google Chat (a messaging app, if you haven't used it). Large language models are great at being chatbots, so you can easily do that.
Another thing you can try is advanced techniques like RAG -- Retrieval-Augmented Generation -- with your files in Google Drive or even Google Keep. This technique is needed because, when you have a lot of material, you just can't pack everything into a single prompt. Even with the announcement of our two-million-token context window model, you often still want to send only the most relevant context to the model. That's when you want to use RAG. We're not going to do a deep dive on RAG here; there are dedicated sessions at I/O this year, so you can check them out and learn the techniques behind it.

Another thing to try is the multi-turn function calling feature. The integrations we demoed use only a single turn: we ask the model to return a function calling response, and that's it; we don't use function calling again. But in reality, you can do this in a multi-turn fashion, feeding the function's result back to the model and continuing the conversation. We're not going through the details, but here's a link; you can read all about it in our documentation.

And lastly, in this workshop we've only demonstrated integrating with Google Workspace, but there's really nothing special about Workspace here. You can build integrations with pretty much any public or even private product or service, and that opens the door to so many things. Hopefully, with our little workshop today, you can start thinking about what you can build with the Gemini API.

So finally, we've all made it. To quickly recap: we walked through how to use the Gemini API, how to leverage its multi-modality and function calling features, and how to build a couple of integrations with Google Workspace. Thank you very much for attending this workshop. We hope you all found it useful, and we can't wait to see what you build in the real world. Thanks again, and enjoy the rest of I/O. Thank you.

[MUSIC PLAYING]
Info
Channel: Google Workspace
Views: 2,423
Keywords: Google I/O; Event - Workshop; Stack - AI
Id: k7I_Uu6bsPU
Length: 43min 25sec (2605 seconds)
Published: Thu May 16 2024