AutoGen Tutorial 🚀 Create Custom AI Agents EASILY (Incredible)

Video Statistics and Information

Captions
AutoGen is here, and it is incredible. AutoGen is a project by Microsoft that lets you create as many autonomous agents as you want and have them work together to get things done. It's a framework, so it's incredibly flexible: you define all the different agents you want and the roles you want them to embody, then send them off to work together, and of course you can work alongside them. Today I'm going to tell you a little bit about it, show you how to install it, and then show you how to use it. Let's go.

This is the blog post announcing AutoGen by Microsoft: "AutoGen: Enabling next-generation large language model applications." AutoGen lets you do what a lot of projects are already doing manually, which is have multiple agents working together to accomplish a task. What I've seen, and what everyone has really seen, is that when you have multiple AI agents working together, the output tends to be much better, whether you're talking about coding, planning, or creative writing, because you have one agent doing the work, another agent checking the work, and another providing feedback, and they go through an iterative loop where the final output just becomes so much better. AutoGen can be dropped into any project, and it also works as a drop-in replacement for the OpenAI API, so you can add multi-agent support without changing your code if you have an existing codebase built on the OpenAI API.

I want to read this part because it summarizes what AutoGen does really well: "AutoGen is a framework for simplifying the orchestration, optimization, and automation of LLM workflows. It offers customizable and conversable agents that leverage the strongest capabilities of the most advanced LLMs, like GPT-4, while addressing their limitations by integrating with humans and tools and having conversations between multiple agents via automated chat." The nice thing is that it doesn't completely depend on OpenAI's API. It's obviously super easy to use with it, but technically you can drop in any large language model with this framework, as long as it has an API.

Here it says: "With AutoGen, building a complex multi-agent conversation system boils down to defining a set of agents with specialized capabilities and roles." So, for example, if we're building an engineering team out of AI agents, we might have one that's an engineer, another that's a project manager, another that's quality assurance, and so on. Then you "define the interaction behavior between agents, i.e., what to reply when an agent receives a message from another agent." So not only are you defining the agents and their roles, you're also defining how they work together.

I'm going to turn off my dark reader so I can show you this chart. In the example they give, they show the two default agents that AutoGen comes with. Number one is the UserProxyAgent, which is essentially an AI agent that works on behalf of the user: it can make decisions, it can ask the actual human for input, but it can also do things on its own. For example, if one of the engineering agents writes some code, it sends it to the UserProxyAgent, which can either send it to the human for approval to run the code or auto-approve and run the code itself. Then we also have the AssistantAgent, which is the most basic kind of agent you can have, and the AssistantAgent is the one that will actually be writing the code.
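As an aside, that "engineering team" idea maps onto code fairly directly: beyond the two default agents, AutoGen has a GroupChat helper for wiring several role-specific agents together. The sketch below is purely illustrative and is not from the video; the role names, system prompts, and task are made up, and config_list stands for the model configuration we set up later.

```python
import autogen

llm_config = {"config_list": config_list}  # config_list: your model/API settings (set up later)

# Role-specific assistants (names and prompts are hypothetical).
engineer = autogen.AssistantAgent(
    name="engineer",
    system_message="You write Python code to solve the task.",
    llm_config=llm_config,
)
reviewer = autogen.AssistantAgent(
    name="qa",
    system_message="You review the engineer's code and point out bugs or risks.",
    llm_config=llm_config,
)

# The user's proxy: executes code and only asks the human at termination.
user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    human_input_mode="TERMINATE",
    code_execution_config={"work_dir": "team_coding"},
)

# A group chat lets the agents take turns; the manager picks who speaks next.
groupchat = autogen.GroupChat(agents=[user_proxy, engineer, reviewer], messages=[], max_round=12)
manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=llm_config)

user_proxy.initiate_chat(manager, message="Write and test a script that parses a CSV file.")
```

The manager deciding who speaks next each round is one concrete way that "interaction behavior between agents" gets defined.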
Going back to the example we just discussed, the blog says a UserProxyAgent and an AssistantAgent from AutoGen can be used to build an enhanced version of ChatGPT plus Code Interpreter plus plugins. The AssistantAgent plays the role of an AI assistant, like Bing Chat or ChatGPT, and the UserProxyAgent plays the role of the user and simulates the user's behavior, such as code execution. If that UserProxyAgent weren't there, the AssistantAgent would pass me the code, and I would open up Visual Studio Code, for example, and run it myself. But with the UserProxyAgent in place, it can run the code itself, saving me time. AutoGen automates the chat between the two agents while still allowing human feedback or intervention; the user proxy seamlessly engages humans and uses tools when appropriate. Another thing it can do is ask me for input: as all of these autonomous agents work together, I can decide at which points the process should stop and ask the human for input.

On the GitHub page for AutoGen they give a bunch of different examples, with the code for each in Google Colab, so we're going to review a couple of them as a way to teach ourselves how to use it. The first one is "Automated task solving with code generation, execution and debugging." That sounds like a great one; let's try it.

I have my Google Colab loaded up; this is the templated example that Microsoft provides, and I'll drop a link in the description to all of the examples we go over today. It says: in this notebook we demonstrate how to use AssistantAgent and UserProxyAgent to write code and execute it. Here, AssistantAgent is an LLM-based agent that can write Python code in a Python code block for a user to execute for a given task. So this is essentially Code Interpreter, but on steroids, and if you remember from using Code Interpreter, if it runs into any problems, for example if it writes some code and hits an error or a bug, it can iteratively fix itself; that's the beauty of the multi-agent framework. UserProxyAgent is an agent that serves as a proxy for the human user: it can execute the code written by the AssistantAgent after asking for my approval, or execute it automatically, and there's a setting for that, which I'll show you, called human_input_mode, along with max_consecutive_auto_reply. The UserProxyAgent either solicits feedback from the human user or returns automatic feedback based on the result of code execution (success or failure, plus the corresponding output). The AssistantAgent then debugs the code and suggests new code if the result contains an error, and the two agents keep communicating with each other until the task is done.

I'm going to show you all of this through Google Colab, but you can easily run it locally as well. The first step is to install the pyautogen library with pip install pyautogen, so I just removed the comment marker and installed it. Next we're going to set our API endpoint, and this is where you can plug in other APIs.
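Roughly, that setup cell looks like this; the OAI_CONFIG_LIST name and a GPT-4 filter like the notebook's come straight from the example, and the JSON shown in the comment is just the expected shape with a placeholder key.

```python
import autogen

# Load model/endpoint entries from a JSON file (or environment variable) named
# OAI_CONFIG_LIST, keeping only the GPT-4 entries. The file is a JSON list like:
#   [{"model": "gpt-4", "api_key": "sk-<your key>"}]
config_list = autogen.config_list_from_json(
    "OAI_CONFIG_LIST",
    filter_dict={"model": ["gpt-4", "gpt-4-32k"]},
)
```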
(If you want to see me create another video where I show you how to use open-source models specifically with AutoGen, let me know in the comments.) Here we just import the library with import autogen, and then we build a config list: we call autogen.config_list_from_json, pass in the name of a JSON source, "OAI_CONFIG_LIST", and then a filter saying the model can be any of a few GPT-4 variants. You can have multiple API endpoints in the config list, so it seems you can actually use multiple models. Here we have the model gpt-4, and we're going to input our API key. I go to OpenAI and open the API keys section (if you don't already have an OpenAI account, go ahead and sign up), click "Create new secret key," type in "autogen" as the name, and create it. Don't worry, I am going to revoke this API key before publishing the video. One quick change I made is putting the config list, that JSON object, directly into the code block, because the notebook can't actually read it from the page; there it's just text. Now I run this, it imports AutoGen and sets up the configuration for me. Done.

The example task is to check a stock price change, so what this needs to do is actually write code to go check the stock prices and then present them to me, the user. Here is where we define our different agents. The first agent is the standard assistant: we call autogen.AssistantAgent with the name "assistant" and an llm_config, which is where we write some configuration for our LLM. We use seed 42; the seed is for caching, so AutoGen tries to cache these prompts, and if we keep using 42 it won't call the LLM again and again for the same request, it will just use the cached version. Then we pass in the config_list, which reads from the configuration we just set up, and then the temperature; as always, the higher the temperature the more creative the result, and the lower the temperature the more standardized, or less unique, the result will be.

Then we have the UserProxyAgent, and as a reminder, the UserProxyAgent represents the user, me, the human, but it can also do things on its own. We call it "user_proxy". The human_input_mode is NEVER, which means it will just execute code as it gets it without asking me for approval. Really quickly, the human_input_mode options are: ALWAYS, meaning ask for input at every single step; TERMINATE, meaning only ask for input at termination, when it's done; and NEVER, meaning don't ever ask the user for input. max_consecutive_auto_reply is set to 10, which is essentially the number of back-and-forth turns we allow before the task terminates; 10 is fine, but obviously the larger it is, the more you risk the agents going back and forth without any human input. Then we define is_termination_msg, which is just the logic for recognizing when termination actually occurs, i.e. when the task has been completed. Then we have the code_execution_config, where we set the working directory to "coding" and use_docker to False, so it will run the code directly rather than inside a Docker container.

And then here's where we actually kick off the dialogue. It always starts with the user proxy: user_proxy.initiate_chat, and we initiate the chat with the assistant with the message "What date is today? Compare the year-to-date gain for META and TESLA."
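Putting that whole cell together, it looks roughly like this; the parameter values are the ones mentioned above, and the termination check is my assumption of the usual "message ends with TERMINATE" pattern.

```python
# The LLM-backed agent that plans and writes the code.
assistant = autogen.AssistantAgent(
    name="assistant",
    llm_config={
        "seed": 42,                # cache key: repeated runs reuse cached completions
        "config_list": config_list,
        "temperature": 0,          # low temperature -> more deterministic output
    },
)

# The agent acting on the human's behalf: it executes the code it receives.
user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",       # never pause to ask the human for approval
    max_consecutive_auto_reply=10,  # cap on automatic back-and-forth turns
    is_termination_msg=lambda msg: (msg.get("content") or "").rstrip().endswith("TERMINATE"),
    code_execution_config={"work_dir": "coding", "use_docker": False},
)

# Kick off the dialogue with the task.
user_proxy.initiate_chat(
    assistant,
    message="What date is today? Compare the year-to-date gain for META and TESLA.",
)
```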
That is the task we're giving it, and keep in mind you can define as many agents as you want to work together, and you can give them all individual, unique roles. Now we run it, and there it goes: "What date is today?" This is the user proxy, me, asking; it's as if I had prompted the large language model with "What date is today? Compare the year-to-date gain for META and TESLA." It's figuring out how to do that; most likely it's going to have to look up the stock prices online, and then maybe put together a graph or just tell me.

All right, here it goes. The assistant replies to the user proxy (remember, this is a conversation): "First, let's get the current date using Python." It writes some Python code to get the date, today = date.today(), and prints it. "Next, we need to get the year-to-date gain for META and TESLA. We can use the yfinance library in Python to get the stock prices. Please install the library if you haven't done so by running pip install yfinance in your terminal." So this is the assistant telling me to install that library; however, the user proxy agent will jump in and do that for me. And it actually gives me the code: "Here is the Python code to get the year-to-date gain for META and TESLA." It writes the code: import yfinance, from datetime import date, get the current year, and so on; there it's using that library. It says this code will print the year-to-date gain for META and TESLA in percentages.

Now here's where the execution actually happens. This is the user proxy, i.e. me, but an agent acting on my behalf, executing the code, and we got a bug. We have a bug and we pass it back; see, it says user proxy to assistant, passing that error back to the assistant to fix. And here it is, the assistant back to the user proxy: "I apologize for the oversight. The variable today was defined in the first code block but not in the second one. Let's correct it." So it actually corrects itself; here's the new code, and here's the code output, and it looks like it worked this time: META year-to-date gain and TESLA year-to-date gain, in percentages. Then the assistant tells the user proxy the code has been executed successfully and packages up the information in a really nice, readable way: as of October 2nd, 2023 (that is, today), the year-to-date gain for META is approximately 140% and the year-to-date gain for TESLA is approximately 131%. This means that if you had invested in these stocks at the beginning of the year, your investment in META would have increased by about 140% and your investment in TESLA by about 131%. Please note that stock prices can fluctuate and the actual gains may vary. TERMINATE. The assistant knew it had completed the task successfully, so it output the TERMINATE response, which stops the conversation completely. This is incredible. It's a very simple example, but it really shows you the power of AutoGen.

Let's say we want to actually plot a chart. We continue: the user proxy sends a message, the recipient is the assistant, and the message is "Plot a chart of their stock price change YTD and save to stock_price_ytd.png." Let's run that; we're just continuing the conversation, we don't have to restart the entire prompt. First it goes from the user proxy (me) to the assistant, stating the new request. Let's take a look at the output.
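For reference, continuing the conversation is just a send call on the user proxy, with the assistant as the recipient; the message string is the one read out above.

```python
# Follow-up request in the same conversation: no need to restart the chat.
user_proxy.send(
    recipient=assistant,
    message="Plot a chart of their stock price change YTD and save to stock_price_ytd.png.",
)
```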
The assistant says: "Sure, we can use the matplotlib library in Python to plot the chart. If you haven't installed it already, you can install it by running pip install matplotlib." We don't have to do that ourselves; the user proxy agent handles it on our behalf, and the assistant writes the code for us. So here we go: it imports matplotlib, tells us how to install it, and then shows how to output the data as a chart. And we have an error: a NameError saying meta_data is not defined. The user proxy passes the error to the assistant, and the assistant replies: "I apologize for the oversight. The variables meta_data and tesla_data were defined in the previous code block but not in this one. Let's correct it." Here we go, we have updated code, it runs again, user proxy to assistant, and it completed successfully that time. Excellent. Now we have stock_price_ytd.png, so let's display the generated file. I just ran this, fresh from the code I just ran, and it worked. This is absolutely incredible; how amazing is that? Now anybody can create a multi-agent environment this easily.

I want to show one more example, because this one is so cool. The next one is "Auto Generated Agent Chat: Teaching." This notebook demonstrates how AutoGen enables a user to teach the AI new skills via natural agent interactions, without requiring knowledge of a programming language. Basically, we're going to create new skills for these agents that they can use over and over again. As always, we start with pip install pyautogen, so I remove the comment and run it. Then we set our API endpoint: we import autogen and set up the llm_config. I changed a couple of things here; instead of having a separate file, I'm just putting the config list directly in the notebook and loading it into the llm_config. Hit play, run this block, done.

The example task is a literature survey: we consider a scenario where we need to find research papers on a certain topic, categorize the application domains, and plot a bar chart of the number of papers in each domain. First we construct our agents; we have an AssistantAgent and a UserProxyAgent, just like before, we set them up, and we run the block. Done. Now we make step-by-step requests. The first one is: "Find arxiv papers that show how are people studying trust calibration in AI based systems." Of course, you could define all of this in separate files and then load them, but we're doing it simply by including it in the code. The user proxy initiates the chat, and the assistant gets to work: "To find arXiv papers related to trust calibration in AI-based systems, we can use the arXiv API." It has no information about what the arXiv API actually looks like, so it makes some assumptions, and it writes some code to do that; there's the output. It's basically going to scrape that and tell me everything I need. Then the user proxy actually executes the code, and it looks like it's grabbing some papers. This is absolutely so cool. And here we grab a bunch of different papers.
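For reference, each of these step-by-step requests is kicked off the same way as before; the first one looks roughly like this, with the task string being the one read out above.

```python
# Step 1 of the literature survey: ask for relevant arXiv papers.
task1 = (
    "Find arxiv papers that show how are people studying trust calibration "
    "in AI based systems"
)
user_proxy.initiate_chat(assistant, message=task1)
```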
The Python script has successfully fetched information about arXiv papers related to trust calibration in AI-based systems: here are the titles of the papers, their authors, a brief summary, and a link to each full paper. It looks like we got a list of ten, and we could click into each of them if we wanted to. Then we get the TERMINATE command, so what we have so far is a list.

We continue with task two: "Analyze the above results to list the application domains studied by these papers." Let's do that. It's probably going to need to read the papers, so let's see how it decides to do it. Okay, here we go; the assistant says: "Based on the summaries of the papers, here are the application domains that these papers have studied." I'm not actually sure how it analyzed the papers; I would have thought it would need to write some code to go analyze them, but maybe it just read the summaries. So now we have the domains.

Then task three: "Use this data to generate a bar chart of domains and number of papers in that domain and save to a file." The assistant says: "To generate a bar chart of the application domains and the number of papers in each domain, we can use the matplotlib library in Python. Here's the script." It provides the script, the user proxy executes the code, and it looks like it ran successfully on the first try. TERMINATE: the Python script has successfully generated a bar chart showing the number of papers per domain. Now let's uncomment the display cell and look at the bar chart. It looks like it named the file something different, so I grab that name, put it in, and run it again. There we go. Unbelievable; that was easy.

Next, we're going to create a recipe: basically a learned piece of knowledge that can be used over and over again to do something similar. Now that the task has finished via a number of interactions, the user doesn't want to repeat all of these steps in the future, so what can the user do? A follow-up request can be made to create a reusable recipe. Task four: "Reflect on the sequence and create a recipe containing all the steps necessary and a name for it. Suggest well-documented, generalized Python functions to perform similar tasks for the coding steps in the future. Make sure coding steps and non-coding steps are never mixed in one function." Let's see if it works; it's so meta, not Facebook Meta, but actually meta. Once we have this, we could technically store it locally, and since some of it is cached, this becomes extremely efficient to run. Here we go: "Sure, let's create a recipe for the sequence of steps as follows," and it lists out the sequence of steps and returns all of the code. Beautiful. "Great, it seems the Python functions for fetching papers from arXiv and generating a bar chart have been defined successfully. You can use these functions in the future to perform similar tasks. Remember to perform the non-coding steps manually." TERMINATE. Done. Now we could reuse the recipe if we wanted to, but I'm going to skip that step.
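Incidentally, the "display the generated file" cells in both notebooks are just standard IPython image displays. A minimal sketch, assuming the chart was saved into the working directory; the filename below is the one from the stock example, and the bar-chart file would be whatever name the generated script actually chose.

```python
from IPython.display import Image, display

# Show a chart the agents saved into the working directory ("coding" in the
# first notebook). Swap in the actual filename the generated script used.
display(Image(filename="coding/stock_price_ytd.png"))
```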
So that's it. If your mind isn't blown by this, I don't know what will blow your mind. This is absolutely incredible, and I'm just scratching the surface. Let me know in the comments if you have use cases you'd like me to explore that I didn't touch on today. If you liked this video, please consider giving it a like and subscribing, and I'll see you in the next one.
Info
Channel: Matthew Berman
Views: 235,789
Keywords: autogen, auto gen, autonomous agents, microsoft ai, ai, artificial intelligence, chatgpt, chat gpt, open ai, openai, multi agent, llm, large language model
Id: vU2S6dVf79M
Length: 20min 10sec (1210 seconds)
Published: Tue Oct 03 2023