Productize prompt engineering with Azure Prompt Flow

Captions
[Music] Prompt engineering is the art of crafting effective prompts for large language models, namely ChatGPT. But prompt engineering can be a challenging, iterative process: figuring out what the best prompt or prompts would be for your request, and you don't want to get stuck in an endless loop of trial and error. So now is a wonderful moment to say hi to a new feature in Azure ML called Prompt Flow, which can dramatically simplify the process of developing prompts, evaluating them, comparing them, and even operationalizing them. This end-to-end process can be code-based or a visual experience within Azure ML, and guess what: you can also reuse pre-existing prompts developed with other frameworks in your prompt engineering projects. You must check out this new feature, so let's go.

Hello everyone, this is MG, and welcome to another video. If you have been dealing with LLMs and prompt engineering, then we all know some pain points. First, at the moment we start developing prompts, there is no baseline or guideline to help me understand how to effectively develop a prompt that is specific to my use case. On top of that, once I have all these prompts, or a flow of prompts, developed for my project, I need to figure out a way to evaluate this end-to-end process, and not just with one, two, three tests. I need a solution that supports a bulk test, making sure my solution behaves consistently before I productionize it. And speaking of productionization, I then need further support to deploy this end-to-end solution: say, an API that, when I call it, uses OpenAI with all the prompts we developed in the backend to answer a question based on my internal company data, or that calls an API on my internal server, much like how ChatGPT calls external APIs. And as you can guess, depending on the complexity of your use case, you are going to develop multiple prompts and a flow: a solution that calls multiple services, one of them being your large language model.

So this is the time to really say hi to a great feature recently cooked up and added to Azure ML, called Prompt Flow. Used properly, it will really simplify and even partly automate all the stages we just talked about. It can support you starting from creating your prompts (including reusing pre-existing prompts developed with other frameworks), through evaluating your prompts, meaning evaluating the end-to-end flow of your prompts with bulk-test evaluation, and lastly, with a click, deploying the solution to an operational stage as an end-to-end endpoint. You must check out this new feature, so let's check it out. Before we start, make sure you subscribe and hit the bell icon so you get notified about the next video.

All right, welcome again, everyone, to this wonderful new feature added to Azure ML called Prompt Flow, which I think has all you need for making a product, an application, out of your prompt engineering use cases. When we talk about a prompt engineering project, or your company building a product or application using large language models, you are talking about multiple stages and multiple requirements, from development all the way to production. The article announcing this was published recently; I call it recent based on the time I'm recording this video. This feature is in private preview mode, so it's pretty fresh, and by the time you're watching this recording, maybe it's no longer in private preview. So I'm sharing these capabilities as they stand at recording time. Honestly, though, I couldn't stop myself from recording this video, because I was really excited when I saw it.
When I saw that this was released, I knew it was going to tackle a lot of pain points you might have faced in your prompt engineering projects. So let me dive into the capabilities, the proposed value of this feature, and then I'll show you some hands-on stuff in action.

When we talk about prompt engineering, let's say you are developing an application: you are a company selling, say, shoes or some specific product, and you want to create a chatbot that lets your customers ask for a specific product through conversation. Say: "Hey, I want a pair of shoes in this specific color, with a specific softness; I want to use them for the gym, for running, for hiking, with this specific design in mind," and so on. You then want to leverage a large language model, for example ChatGPT running on GPT-3.5, to take that input from your customer, check the internal database listing your shoes, and return one back based on the user input. That's just one simple use case. Another: we already created multiple videos on how you can chat with your company data (I'll add the links to those videos at the top right of the screen), on how you can use, say, ChatGPT to ask questions about your company's data, not just the internet data it was trained on up to 2021. So these are different types of scenarios where you can leverage large language models.

Regardless of the use case type, you always start with a design, and designing prompts is the initial thing you need in place to start asking questions of your large language model. But that's not the end; actually, that's just the beginning. Once I have the initial prompt as input, I need to do some quick tests and check the results from the model. If I'm not satisfied with the prompt I provided, I'm going to modify my prompt, or prompts, because depending on your application you might have multiple prompts; we'll talk about that. It gets even more complex when you're dealing with multiple prompts that call an API, answer questions, summarize something, create a URL, or search embeddings. These are different prompts you can create, and you need to evaluate all of them.

Let's say you're lucky and after just a couple of iterations you end up with some prompts you think are good enough. You then want to go further with a more major test across several scenarios, beyond just one or two trial-and-error runs. First of all, quite honestly, good luck succeeding with pure trial and error: that stage will take a ton of time to figure out the best prompt if you don't follow best practices and use tools like Prompt Flow, which was just released. Second, if you pass that stage and come to the evaluation stage, you'd otherwise need to build a custom solution to test this flow with a bulk test. For that product-company use case, I'd have to come up with multiple questions a user might ask about specific shoes, products, or clothes, check what the potential answers would be, and figure out whether my prompts, or flow of prompts, produce output that meets my requirements across all those scenarios, not just one or two. That's another custom thing you'd need to build. And lastly, when you're done with all this prompt flow work, which can include Python code that calls your GPT model, say GPT-3.5, plus your different APIs and data sources, the entire end-to-end solution needs to get deployed as well, to create your own endpoint for it.
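To make the bulk-test idea concrete, here is a minimal sketch of what such a custom harness might look like. Everything here is hypothetical: `run_flow` is a stand-in for whatever chain of prompts and API calls your real solution uses, and the pass criterion is just a keyword check.

```python
# Minimal bulk-test harness sketch (hypothetical; `run_flow` stands in
# for a real prompt flow that would call a large language model).

def run_flow(question: str) -> str:
    # Placeholder for the real flow: embed the question, search your
    # product database, build a prompt, and call the model.
    catalog = {"running": "TrailRunner X, a cushioned running shoe",
               "hiking": "RockGrip Pro, a stiff-soled hiking shoe"}
    for keyword, product in catalog.items():
        if keyword in question.lower():
            return f"I recommend the {product}."
    return "Sorry, I could not find a matching shoe."

def bulk_test(cases):
    """Run every (question, expected_keyword) case and report a pass rate."""
    results = []
    for question, expected in cases:
        answer = run_flow(question)
        results.append((question, answer, expected.lower() in answer.lower()))
    passed = sum(1 for _, _, ok in results if ok)
    return results, passed / len(results)

cases = [
    ("I need shoes for running in the gym", "TrailRunner"),
    ("Something sturdy for hiking, please", "RockGrip"),
    ("Do you sell sandals?", "sandal"),
]
results, pass_rate = bulk_test(cases)
print(f"pass rate: {pass_rate:.2f}")  # 2 of the 3 cases pass here
```

The point is only the shape of the loop: many scenarios in, a metric out, instead of eyeballing one or two answers.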
That's yet another piece of custom work you'd need to do. But with the announcement of Prompt Flow in Azure ML, all three of these stages are covered for you. How? First, it lets you start creating prompts, not just from scratch but also from pre-built prompts developed with other frameworks. Then it helps you keep multiple versions of your prompts and compare and evaluate them to see which works best. Then it lets you do a bulk test with a large amount of data, so you can test the prompts with far more than one or two examples and see evaluation metrics, similar to how we evaluate a classification model, as an example. And lastly, when the solution is done and evaluated, with just a click you can deploy the end-to-end solution as an endpoint, exactly the way you deploy a real-time machine learning model. You get a real-time endpoint for the whole solution: you give the API to someone else, and as soon as they call it, the solution goes and checks your database and recommends a shoe based on what your customer asked. This is fantastic.

Now, I'll add the link to the article in the Discord channel. Again, for enabling this new feature in Azure ML: it is in private preview, so if you want to try it now, you have to submit a form that grants you access to this and most other private preview features of Azure ML. I'll add all these links to the Discord channel, so make sure you click the Discord link in the video description; it will take you to the channel, where you can check all the references relevant to this video.

So now let me jump to my Azure ML. I got access to this feature, and I'm going to show you how it works. All right, here it is: my favorite tool, the Azure ML workspace, with this nice new feature added, called Prompt Flow.
Since this is a private preview, some changes may come to this tool based on customer feedback, but as of now I'm sharing what is available and what was just announced at Microsoft Build last week.

Okay, so when you click on Prompt Flow (of course, after enabling access and filling out the form I mentioned), you'll see a few tabs. The first one is where you start creating the flow of your prompts; we'll talk about that. The second tab is for creating connections to your large language models. Say you have access to Azure OpenAI and you've deployed a GPT-3.5 model and need to use it in your solution, whether for chat completions or for creating embeddings with a different model: you need to create a connection to those models. If I click on Connections and then Create, there you go: you can connect a model from non-Azure OpenAI, the publicly available one where you just go to openai.com, log in, and get your API key, or one you created in Azure OpenAI. And of course, in your prompt engineering project you might need to connect these models to the internet. We created another video on connecting ChatGPT to the internet using Bing search, so you can create a connection to your Bing API here to use later in your project, or you can connect to SERP for leveraging different search engines, say Google, with your large language models.

Now, I already created a connection to my Azure OpenAI, because I want to use the large language models I deployed there. I gave it a name, and you need to paste your API key, base, type, and version; you can grab them all from your Azure OpenAI resource. I copied and pasted them, and now I have my connection here. I call it test mg.
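For reference, the four values such a connection needs (key, base, type, version) map onto the same settings the OpenAI SDK uses for Azure. The sketch below just validates and bundles them; every value shown is a placeholder, not a real key or endpoint, and the API version string is only an example.

```python
# Sketch of the settings behind a Prompt Flow connection to Azure OpenAI.
# All values are placeholders; in practice they come from your Azure
# OpenAI resource's "Keys and Endpoint" page.

def make_azure_openai_connection(name, api_key, api_base, api_version):
    """Bundle the values a connection needs, with light validation."""
    if not api_base.startswith("https://"):
        raise ValueError("api_base should be the https endpoint of your resource")
    return {
        "name": name,
        "api_type": "azure",        # distinguishes Azure OpenAI from openai.com
        "api_key": api_key,
        "api_base": api_base,
        "api_version": api_version, # a dated REST API version string
    }

conn = make_azure_openai_connection(
    name="test-mg",
    api_key="<your-key-here>",                          # placeholder
    api_base="https://my-resource.openai.azure.com",    # placeholder
    api_version="2023-05-15",                           # example version
)
```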
Next, Runtimes. For running all this prompt flow work, including your custom Python code that calls OpenAI or does any sort of transformation needed in your prompt engineering project, you need a server running, and this is where you create it. You can create two types of runtime. First, a runtime from a compute instance you created in Azure ML: if I click on it, I give it a name, mg001, and here is a list of compute instances I created in Azure ML before; the one that's running is the first one. When I select it, it needs to install some packages and libraries to make my prompt flow and prompt engineering work, say some OpenAI packages, LangChain, or Semantic Kernel. You can use the default environment, a predefined runtime with all the packages Prompt Flow needs, which is what I'd recommend, or create a custom one. That's it: click Create, and your runtime will be created. Later on, when we want to deploy our prompt engineering project as an endpoint, I can create a managed online endpoint runtime as well. I call it mg002, then it's the same thing: the environment we talked about, select your compute, and click Next and Review to create it. I won't do it here because I already did; I have an online runtime as well, for deploying the solution I'll show you shortly. So that's the second and third tabs.

Now let's go to the main one, Flows, the place where you create your prompt flow. Think about it like this: with this service you can create your own company copilot, a large-language-model-based application that can do any task you design. You can have something like ChatGPT that answers questions from your company data, generates code, or generates any type of content these large language models support, based on your customized dataset, your own company requirements, and your own enterprise data. And of course you can go beyond your company data: you can connect to external APIs, you can connect to the Bing search engine. There are lots of possibilities here.

So, going to Flows: let's say I want to create my first prompt flow, and I click Create. There are a couple of things you can do here. You can create the flow by type. The first type is the standard, generic flow: that's for when you're developing a generic prompt flow and want full flexibility, a nice starting point, and you can see the description here. But if your use case is more like a chatbot application, for example the one I mentioned where a user describes a specific shoe they want to buy and your chatbot, powered by a large language model, gives them a product suggestion, then to me that's a chat flow, so I would start with that. And remember that a very important step for us is evaluating these prompts, so if you don't want to start from scratch, click on evaluation flow: this gives you a nice baseline for evaluating your prompts and prompt flows, to make sure the way you're prompting these large language models is reliable enough to go to the next iteration, or even to production. You can also create from the gallery if you don't want to build your own from scratch, and I actually want to use one or two of these examples to show you how Prompt Flow works. For example, if you want to start with chatting over a specific source of data, use this one, "chat with Wikipedia", and modify it or bring your own data. Remember we made a video about chatting with your own company data; here is an example of that in the gallery. Click on it, start modifying it, and run it to see how it works.
Or, if you have a vector database with your embedded text in it, you can query that information and start a Q&A against your vector database here. I actually created two flows, one from "bring your own data" and one from "chat with Wikipedia", and I want to share them quickly. But before I click on those, there is another type of object you can create from the gallery: an evaluation.

Let's say I'm done with my prompt flow and I want to see how to evaluate the prompts. It's not an easy task. Think about it: if you have a classification model that predicts whether tomorrow is going to be rainy or not, the evaluation of that model is simple. How many times did my model say it would rain tomorrow, and it actually rained that day? From that we can measure, say, the accuracy of the model. You can do the same for some of your prompt flow projects: if you're using ChatGPT to classify a given text, you can use the classification template here, and you evaluate the accuracy of the classification done by your large language model. But sometimes the output of your language model is not a classification value; it's text, like how ChatGPT talks to you. Then how do you evaluate the performance of a ChatGPT-style application that returns text back?

On the backend, the way the Azure ML team built evaluation for those chat-based scenarios is by using another large language model, GPT-3.5 or, I think, actually GPT-4, to check how your large language model performed: given the prompt, it compares the model's output against the desired, ground-truth answer and measures the similarity between what you expected and what the model actually gave you. It is scored from one to five. So it's using a large language model to evaluate a solution that itself uses another large language model. I know, it's a sort of nested solution, but that's how it works, and I think it's a brilliant idea. As I mentioned, classification is simple; but for, say, a similarity evaluation, it compares the answer you hold as ground truth with what your large language model, say GPT-4, actually answered for your question. That means you have to have ground-truth data to evaluate your prompts.

Going back to Flows: I actually ran one of these evaluations, which I'll show you shortly. For the flow itself, I started with "bring your own data". If I click on it, it tells me it will take the question from the user, search your vector database to figure out which source of data is relevant to the user's question, bring it into the prompt, and then answer the question. This is the flow. I could now create my own vector index to start chatting with my own data, but if you just click Clone, it starts with an example public dataset, asking questions against that indexed vector data and answering them. So if you click Clone, you'll see "chat with your data" appear here. If I click on it, there you go: this is the nice playground of Prompt Flow. I know there's a lot going on, but let me simplify the process for you.

The first thing you provide as an input is here: if I click on Input, the input is simply a question. "How to use SDK v2 of Azure ML?" is the example question. Now, we should create an embedding of this question so it can go to the vector index and check which source of data is most similar to the question. Long story short, I want to find which source of data I have is closest to this question, so I can bring back the answer.
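The judge-model idea can be sketched in a few lines. Note this is my own illustration, not the actual template Azure ML uses: the judge's reply is canned rather than coming from a real GPT-4 call, and only the prompt construction and the 1-to-5 score parsing are real logic.

```python
import re

# Sketch of LLM-as-judge scoring (illustrative only; the real Prompt Flow
# evaluation templates differ). A judge model grades similarity between a
# ground-truth answer and the model's answer on a 1-5 scale.

def build_judge_prompt(question, ground_truth, answer):
    return (
        "Rate how similar the candidate answer is to the ground-truth "
        "answer for the given question, on a scale of 1 (unrelated) to "
        "5 (equivalent). Reply with the number only.\n"
        f"Question: {question}\n"
        f"Ground truth: {ground_truth}\n"
        f"Candidate: {answer}\n"
        "Score:"
    )

def parse_score(reply: str) -> int:
    """Pull the first 1-5 digit out of the judge model's reply."""
    match = re.search(r"[1-5]", reply)
    if not match:
        raise ValueError(f"no 1-5 score found in: {reply!r}")
    return int(match.group())

# In a real flow, this prompt would be sent to a judge model such as GPT-4;
# here we parse a canned reply to show the round trip.
prompt = build_judge_prompt(
    "When was the Azure ML SDK v2 released?", "2022", "It came out in 2022.")
score = parse_score("Score: 5, the answers agree")
```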
That's why I have my input here first. Second, if I click on the next step, it actually embeds this question. How? This is where we need a large language model, and that's why the icon on this node shows it's an LLM-based task. How was it created? If you click on LLM, there you go: it uses an OpenAI large language model. When I clicked it just now, it started creating another node I didn't want, so let me delete that one so things don't get messy. There you go; I don't want that step.

Okay, moving back to the top: we covered the first step, and the second is creating embeddings. Remember I told you that you need a connection to your OpenAI models? I created one and called it test mg, and that's why I needed it: the node knows my Azure OpenAI service, API token, and so on, so it can connect. I tested it to make sure it's working: I ran it once, and it was able to create the embedding of my input, the user's question.

What's the next step? Now I need to search the question against the indexed documents. I have the embedding of the user question, and I should go to the data source, the index, to bring the relevant data back into the prompt and start answering the question. How was this step created? It's not an LLM node; you're right, it's actually a vector index lookup. If I click on More Tools, you'll see a tool called Vector Index Lookup; click it and something similar appears. It searches text or vectors in an Azure ML vector index. There are other tools too: you can use a translation tool to translate a given text, or a FAISS index lookup (FAISS was developed by Facebook; it's another similarity-search method) as another step. There's also Content Safety, and Vector DB Lookup, which searches an existing vector database to answer the question. These are the different types of steps, or you can just create a vanilla Python node: click Python and you get a Python step, which we actually have here, as I'll show you. So let me delete that extra node and go back.

Now we have the embedding created, then the "search question from index docs" step, and then its output goes into a Python node. Let's see what this one does. You can see it grabs the search results; they come from the output of the previous step, so the indexed source data comes in as input, which is why it's specified here. All this code does is simply build what I need to put into my prompt to ask ChatGPT later: the question that was asked and the source data grabbed from the vector index. These become inputs for the next part, the prompt that I'm going to send to my large language model.

So what is the prompt here? The prompt is: "You are an AI assistant that helps the user answer the question, and you should use the given context," blah blah, and here is the context, and here is the question. Where do these come from? Look at that: my context comes from the Python node, and the question comes from the flow input. You can see, even in this nice visualized graph, two inputs coming into this prompt variant: the question I have, and the source data I got from the vector index. And it's called a prompt variant, meaning that if I'm not happy with this prompt, I can start generating more variants of it.
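Here is a tiny self-contained sketch of what those two steps, the vector lookup and the Python node that builds the prompt, boil down to. The embeddings and documents are made up, and a real flow would embed the question with a model and query a proper index such as FAISS rather than brute-forcing cosine similarity.

```python
from math import sqrt

# Sketch of a vector lookup plus the Python step that assembles the prompt.
# Embeddings and documents are toy values, not real model output.

DOCS = [
    ("SDK v2 guide", [0.9, 0.1, 0.0]),
    ("Pricing page",  [0.1, 0.9, 0.0]),
    ("Release notes", [0.2, 0.2, 0.9]),
]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(x * x for x in b)))

def lookup(question_embedding, docs, top_k=1):
    """Return the names of the top_k docs most similar to the question."""
    ranked = sorted(docs, key=lambda d: cosine(question_embedding, d[1]),
                    reverse=True)
    return [name for name, _ in ranked[:top_k]]

def build_prompt(question, context_docs):
    """The 'Python node': stitch retrieved context and question into a prompt."""
    context = "\n".join(context_docs)
    return (f"You are an AI assistant that answers from the given context.\n"
            f"Context:\n{context}\n"
            f"Question: {question}\n"
            f"Answer:")

question_embedding = [0.8, 0.2, 0.1]   # pretend embedding of the question
hits = lookup(question_embedding, DOCS)
prompt = build_prompt("How to use SDK v2 of Azure ML?", hits)
```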
Right now I have just one variant of the prompt, but I can create a second one, either starting from scratch or from the first one, and then evaluate the performance of all these prompts on the same task to see which one works best for me. For now, let's go with the default.

Lastly, I'm ready to ask my large language model, say GPT-3.5, the model ChatGPT uses, to answer based on the user's question. There you go: the output of the previous step, my prompt, comes in as input to call my OpenAI model, and it generates the output back, which is the answer to the customer's, or user's, question.

Before I run this, let me show you another example: chat with Wikipedia, which I created just by clicking "chat with Wikipedia" in the gallery. Let me open it to understand Prompt Flow a little better; let me find the one I fixed so it works properly. There you go. Here, what I'm going to do is start asking a question and get the answer from live Wikipedia information, even information added to Wikipedia after 2021, which is when ChatGPT's training data ends, along with the source URL. This time I'll move a bit faster, because you already know how these steps are created: this one generates Python code, this one is a prompt, this one calls an OpenAI large language model, and the different tools are available here. The input is again a question: "What did OpenAI announce in March 2023?" Again, this is not something ChatGPT will know by itself, so we're going to connect it to Wikipedia and get the answer from there.

How? Let's check the first step. If I click here, this is the initial prompt: "You are an AI assistant, you should do this and that," plus a couple of examples. The user might ask something, and here's the output; the user asks something else, and here's the output. So it's showing the model a few examples of how to chat with Wikipedia data. Then, since this is a chatbot-style scenario, you can see we pass the whole chat history to the large language model, followed by the last question from the human user; what comes next is the model's answer to that last question. This is exactly how chatbots work.

But we're not there yet. To answer the user's question, we need to connect to Wikipedia, grab the relevant information, generate the URL, and cite the source of the answer. That's why we have multiple custom Python steps here. Without going through all the details: this code gets the wiki URL, then this one fetches the search results from the generated URL, and this one processes the search results into a form that's ready for answering the user's question. Then there's another prompt. Before I go further, you can see why we have two different prompts that call a large language model: the first one tries to understand the intention and context of the user's question, and based on its output we generate the URL and pull search results from Wikipedia; the second LLM call is the one that actually answers the question. It gets the context, the URL, and the user's chat history, and gives us the answer.

Now, you can test each of these steps manually, one by one, to run them (I did, successfully), or you can do a bulk test here (as I mentioned, you need your ground-truth data for that), or just click Chat to test it on the fly. So let me click on it.
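The chat-history mechanics described above, history first, then the latest question, then the model's reply slot, look roughly like this when flattened into the messages format used by chat-completion APIs. This is a generic illustration, not Prompt Flow's internal representation.

```python
# Sketch of turning a chat history plus the latest question into the
# messages list a chat-completion API expects. Generic illustration only.

def to_messages(system_prompt, history, latest_question):
    """history is a list of (user_text, assistant_text) turns."""
    messages = [{"role": "system", "content": system_prompt}]
    for user_text, assistant_text in history:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    # The latest user question goes last; the model's next reply answers it.
    messages.append({"role": "user", "content": latest_question})
    return messages

msgs = to_messages(
    "You are an AI assistant that answers from Wikipedia context.",
    [("Who founded OpenAI?", "OpenAI was founded by a group of researchers "
      "and entrepreneurs in 2015.")],
    "What did OpenAI announce in March 2023?",
)
```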
You can see this was the default question, "What did OpenAI announce?", and it answered properly. Then I ran a second example, "What's the difference between this model and previous ones?", and it got the answer plus the source. Now I'm happy with this chat-with-Wikipedia flow, which could just as well be chat with your company wiki or your internal website: use these templates and build your chatbot.

Now you want to deploy it: make this chatbot an API, an endpoint you can use somewhere else, not just for testing here in Azure ML. Then simply click Deploy, and that's it. It is as simple as deploying a machine learning model: you give it a name, define your compute, click next, next, next, and with just a couple of clicks you've deployed this end-to-end solution. I did it already: if I go to my Endpoints, you'll see the latest one I created says it's healthy, I can test it on the fly, and it even gives me examples of how to consume this endpoint, which is a prompt flow project, not just a simple machine learning model.

There is much more, quite honestly. I've been recording for almost 30 minutes, this is something new that I'm personally still experimenting with, and more is coming, some of which I haven't even gotten access to yet. For example, monitoring of your prompt engineering projects: maybe my prompts and everything I evaluated work properly today, but there's no guarantee that tomorrow, or a week from now, your prompts won't stop performing as you expect. So there's a monitoring service that will watch the performance of your prompt flow, the one using large language models, and tell you: hey, it's time to revise your prompts or your solution. So apply for this private preview feature, or, if you're watching this at a time when it's no longer in preview, just give it a try and test it out for your specific use case.
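Consuming such a deployed endpoint is an ordinary authenticated REST call. The sketch below only builds the request; the URL, key, and payload shape are all placeholders for what your own endpoint's consume tab would show, and actually sending it would take something like `requests.post(url, headers=headers, data=body)`.

```python
import json

# Sketch of preparing a call to a deployed flow endpoint. The URL, key,
# and payload shape are placeholders; copy the real values from your
# endpoint's consume examples before sending anything.

def build_endpoint_request(scoring_url, api_key, question, chat_history=None):
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",  # key-based auth placeholder
    }
    body = json.dumps({
        "question": question,
        "chat_history": chat_history or [],
    })
    return scoring_url, headers, body

url, headers, body = build_endpoint_request(
    "https://my-endpoint.region.inference.example.com/score",  # placeholder
    "<endpoint-key>",
    "What did OpenAI announce in March 2023?",
)
```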
We went over different types of use cases you can explore further, or you can create a generic flow of your own. We didn't get a chance to dive deeper into the bulk test or the monitoring features, but I don't think we could fit them all into one video, and I don't want to make it overwhelming; I think there's already a lot of information to digest. So this was an introduction to this art of the possible. I know I jumped quickly through different components without much deeper detail, but please make sure you write in the comment section what else you're interested in knowing about this capability, so I can support you with more specific content on this new Prompt Flow feature. Stay tuned: more features and capabilities are coming, and as soon as they're out and I personally get access as a regular user, I'll make sure I post about them so you can give them a try. I hope you enjoyed this video.

That's all. How do you start from nothing? Number one: make friends who are hungry and smart like you. These friends will rise over the years, and don't forget, loyalty is power, and power is loyalty. Number two: now choose friends who are more powerful than you, and serve them. You help people, and they will help you back. [Music] Number three: focus on your very first small successes, which give you more resources to develop and create bigger successes over and over again. And lastly, when you make decisions, use both your brain for reason and your soul for intuition. Dream big, my friends, believe in yourselves, and take action. Until the next video, take care. [Music]
Info
Channel: MG
Views: 9,024
Keywords: AzureML, PromptEngineering, Azure, MachineLearning, LargeLanguageModels, AI, DataScience, DeepLearning, AIApplications, PromptFlow, AzureMLPromptFlow, AIRevolution, ArtificialIntelligence, AzureCloud, CloudComputing, DataAnalytics, MLModeling, MLDeployment, DataEngineering, Python, MLLifecycle, AIOptimization, NLP, MLAlgorithms, AutomatedML, Chatbots, MachineLearningTools, AIModels, LLMApps, AzureInfrastructure, MLPlatform, MG Azure, MG Cloud, MG AI, MG ML, mg ml, prompt engineering, chat gpt, coding prompts chatgpt
Id: psyoL-nikWM
Length: 34min 27sec (2067 seconds)
Published: Tue May 30 2023