Azure AI Studio Build Your Own Copilot Code First Demo

Video Statistics and Information

Video

Captions Word Cloud

Reddit Comments

Captions

Hi everyone, I'm Dan Taylor, a product architect working on Azure AI Studio. In this video, I'll be showing how you can build your own copilot in Azure AI Studio with custom Python code by using the Azure AI SDK and CLI in our hosted VS Code web development environment. Our goal is to build an API that grounds your copilot and your enterprise data, and to do that we'll be creating a chat function that will take a list of chat messages from the user, generate A vector representation of the user's question using an Ada embedding model from Azure Open AI service. We'll then use this vector to perform a vector search on an Azure AI Search index to retrieve relevant documents, and then we'll use those documents to generate a prompt that we send to a GPT 35 Turbo model hosted in Azure Open AI service. We'll deploy this API to a managed online endpoint so that it can be consumed by a web application which is outside of the scope of this video. And then all of this will be hosted in an Azure AI Studio project so that we have a single place to organize all the components of our application. In the video, we're gonna be taking the following steps to start from complete scratch without skipping anything. To deploy this application, first we will go to Azure AI Studio to create an Azure AI resource and project. We'll then open VS Code Web and clone a sample app that we'll start with. Then we'll move on to using the Azure AI CLI to create the model deployments in the Azure AI Search index, and then test the model deployments and search index to make sure they're working. And finally, generate environment variables that our code can use to connect to the Azure AI resources. Then we'll switch over to the Azure AI SDK, where we'll use the SDK to run and evaluate the chat function locally to make sure it's working well. We'll then improve the prompt used in the chat function and evaluate how well those improvements are improving the quality of our Copilot responses. We'll then deploy the chat function to an API, and then we'll just invoke the API and get a streaming Jason response as an example. All right, we're going to get started building everything starting from scratch from ai.azure.com. Let's dive right in. Let's go ahead and head on over to ai.azure.com. From the homepage, we'll click the Build your Own Copilot button to create a new project. Here we'll specify a project name and we'll select to create a new Azure AI resource to host our project. When creating the resource, we'll just pick our subscription and the location that we want to host our resources in, which will be E us. Now we'll go ahead and wait for this project to be created. This project is connected to an Azure AI resource which organizes the compute data and models that we'll need for our solution. A resource can have multiple projects connected to it, which allows you to share your cloud resources amongst different members of your team. Now that we've created our project, we'll go ahead and get started coding by opening the project in VS Code Web. First, we'll need to create a compute that VS Code Web runs on, so we'll go ahead and click Create here. Now that our compute is created, we'll set up an environment for VS Code to run in on this project. Note that you can have different environments and different projects running on the same compute. The environment is basically a container that is available for VS Code to use for working within this project. Now that our environment is ready, we'll go ahead and launch VS Code Web, which will open up in a new browser tab. When VS Code opens, we get a helpful read me telling us about how to use this VS Code Web environment. In particular, I want to point out that there is a personal code folder for us to use for our own personal work for cloning git repos, and a shared folder that has files that everyone that is connected to this project can see. So if we have shared assets or data that we want to share with other people, we can go ahead and move it to that shared folder. And if we created any prompt flows in our AI Studio project, they would also show up here in the shared folder as well. First, we just need to allow VS Code to trust the code in this repo. We'll press CTRL Shift` to open the terminal and we'll go ahead and clone the sample repository that we have in the read me here. So we'll CD into our code folder and we'll go ahead and clone that sample repo and then we'll CD into that sample repo. We can see that there is a helpful README that comes with the sample repository which contains instructions that we can use to install packages and create our application. First, we'll follow the instructions to create a virtual environment for installing packages, and then we'll install the requirements dot TXT file which contains the Azure AI SDK, including the generative package for running evaluation, building indexes and using Prompt Flow. And when the packages are finished installing, we can go ahead and follow the remaining steps in the README file. We will run the AI init command which will create and configure everything we need for our code to run. The first thing that the CLI prompts us to do is it helps us log in and authenticate with the Azure CLI. So we'll see that this supports 2 factor authentication and then we can go ahead. Once this login is completed, we can go back to VS Code Web and continue the steps from there. I can now pick my Azure subscription to use, select the project that we just created, and now we'll be able to create the models from the Azure Open AI service that we need. So first we're going to select a chat model. Let's go ahead and use the GPT 35 Turbo 16K model. It's created a default name for us to use for that deployment. Now we want to select our embeddings deployment, which will be used to vectorize the data from the users, and again, we'll just select the default name. And then for evaluation we'll go ahead and also use the GPT 35 Turbo 16K model, although GPT 4 is also a good choice for better results. We're going to create a new Azure AI Search resource here to host our vector index, and we'll just pick the location E US. We want that in the same place as our project, and we'll go ahead and use the same resource group that was generated for our Azure AI Search resource. And again, we'll just pick the name that the CLI suggested for us here for the Azure AI Search resource. And now the CLI will wire up all of our choices and configurations so that they can be used by our code. First it outputs this config dot Jason file which is used to point the sample repo at the project that we created. And now we'll also run the AI search command to create our search index by ingesting the markdown files that are in the sample repository. So here in the data folder we have all these product catalog informations for this company, Contoso Trek, which is product information for our enterprise that we want to use to ground our copilot in. And so now that we've gone ahead and created that vector search index before we start running custom code, we can use the actual AI chat feature to use the built in chat with data capabilities of the Azure AI Studio just to make sure that everything's working properly. And so we can ask a question like what tent is the most waterproof of the Assistant And we can see if the built in chat with data capabilities are working. And we can see that that is now retrieving information from these markdown files that we just ingested. Awesome. So now that we can tell our resources are all set up correctly, now we want to generate environment variables for use with our code so we can use the AI dev new command to generate a N file and so this N file contains all of the configurations that we just set up with the AI init command and wired that up to our code. If you're a developer, you may be used to creating these N files manually, which is a lot of tedious work. And the CLI has automated all of that for us, which is great. And now that we've configured our environment for our code to run, we can move on to running the sample Copilot application. So first, we're just going to copy this command here, which invokes the run dot PY file. This run dot PY file contains a reference implementation that shows how to use the Azure AI SDK to implement the Copilot. So you can see that we asked that same question that we just ran from the CLI and we can see that in response we got the answer to the question that which tent has the highest waterproof rating? And we also can see the retrieved documents that were returned from our Azure AI Search index. Let's go ahead and take a peek at this run dot PY file to see how this works. So we can see at the top of the run dot PY file we load that end file that was just output by the Azure AI CLI and if we go to the bottom of the file we can see here what we do is we just simply take the question that was passed on the command line and we run this chat completion function with the question as a single message from the user. Now if we go and look at the implementation of this chat completion function in the sample repo, we can see it's actually pretty straightforward. First we take the list of messages from the user. We get the last message in the conversation and we pass that to the getdocuments function. The get documents function then uses the Open AI SDK to 1st embed the user's question into this vector embedding and then we use the Azure AI Search SDK to then do a vector search using that vector query and return the documents back to the user. Once we have those documents we then generate the prompts. The prompt is generated using a Jinja template, which we can take a look at here. That Jinja template contains the instructions to the Azure Opening service model and contains the list of documents and then we return the response back to the user. It's that simple. Now you'll notice that this does pretty much the same thing that the built in chat with data capability did. However, since it's in custom Python code, we can do whatever we want here to add calls out to our own APIs, calls out to retrieve documents from other data stores. We can customize this however we want. We can change the prompts. We can change the parameters. We can really make this our own, which is going to be especially important as we want to take this Copilot to production. And Speaking of production, one critical part of getting your Copilot to be production ready is to evaluate and improve the quality of responses that you get with the Copilot. So we're going to move on to evaluating this Copilot next. And for that we're going to use this evaluation data set which contains a bunch of example questions and answers that we can use to see how well the Copilot is performing. So to run this, we can go ahead and use the evaluation command in the run dot PY file. We can say dash, dash, evaluate. And while that's running, let's go ahead and look at that implementation of the evaluation. This imports the evaluate function from the Azure AI generative SDK package and it loads that sample data set that we are just looking at. And then it runs the evaluation call which takes the chat function as the target with the data set and generates a set of GPT assisted metrics to evaluate the quality. We can see in the output here that for each question we get the answer as well as the metrics in this nice table format. We also can see the evaluation output as Jason files, so we can look at. Let's just turn on word wrapping. Here we can look at for each question, what was the answer that came back from the copilot and what were the retrieved documents as well as a score. So we can use this sort of the debug the answers, but we can also use this URL here to open the evaluation results in Azure AI Studio where we can get a nice visual of all of the inputs and outputs and we can use this to debug and improve. So here we can see the distribution of scores. So here we're calculating a standard set of GPT assisted metrics that help us understand how well is the copilot's response grounded in the information from the retrieved documents. There we've got a score of four, point O 8. We can see how relevant is the answer to the user's question. Here we've got an average score of 4.38 and overall how coherent and natural is the response of the generated text. And here we get a score of four point O 8. We can look at the individual rows and questions that are being asked, so we can see for each question what is the answer provided, what were the retrieved documents in this nice visual here and also the provided truth answer. So as we Scroll down here, we can see that the brand for the trail Master tent the this score is low with ones we can see that the answer, the copilot didn't even attempt to answer the question. So that's maybe one question that we want to be able to improve the answer on. So let's go back and see if we can improve the prompt using our Copilot here. We've had a teammate who's really good at prompt engineering come up with a nice, safe and responsible and helpful prompt. And we're just going to update that prompt and rerun evaluation from the command line. We're going to say dash evaluate. And this time we're going to give it evaluation name of improved prompt so that we can easily keep track of this evaluation result when we go back to the studio. All right, now that that evaluation has completed, we can go back to the studio here. This time we'll just look at our list of evaluations. We can see the history of those evaluations. So we can always go back to previous results. Here we're going to click the two evaluations and then we're going to compare. When we compare, we can see that we can see that the scores with this new prompt have improved significantly overall. And then we can for each of these responses see the differences in the two Co pilots. And as we Scroll down, we can see that in this case that we are actually able to get a better, better answer to that question of what the brand is for the Trail Master tent. Now let's go ahead and deploy this Copilot to an endpoint so that it can be consumed by an external application or website. To do that, we'll go ahead and run the deploy command in the run dot PY and we'll actually provide a deployment name here, Copilot SDK Deployment. So again, this just uses the Azure AI generative SDK to deploy the code in this folder to an endpoint in our Azure AI Studio project. So we can see that it just uses the Deployment class here. That takes the current folder, the conda dot yaml, as a set of packages to install, and it points to the chat function that we just looked at. As well as setting up all of the environment variables, including grabbing secrets from our project so that when the code is deployed in a production environment, it runs just the same as it does when running locally. So that endpoint has been created and the source code has been uploaded. Now we'll just wait for our endpoint and deployment to be ready to use. Now that our deployment has completed, we can run the invoke command on run dot PY and here we can specify the deployment name. This will return the response as a full Jason BLOB. Here we can see the answer as well as those retrieved documents. We can also specify the stream command which will return the response in small individual pieces which can be used by a interactive web browser to show the answer as it's coming back in individual characters and those characters are visible in the content parameter of each row of the Jason response. That was a quick tour of how you can build your own custom copilot using code with Azure AI Studio. Be sure to head on over to ai.azure.com to get started today, and you can also check out some of the resources on the screen to deep dive into more information. Thank you for watching.

Info

Channel: Microsoft Azure

Views: 18,319

Rating: undefined out of 5

Keywords: Azure AI Studio, Azure OpenAI Service, Generative AI, GenAI, Code Frist, azure ai, Microsoft, Azure, microsoft azure, azure ai studio, ai azure

Id: dSUWCbFnQ14

Channel Id: undefined

Length: 19min 26sec (1166 seconds)

Published: Wed Feb 14 2024