4 Autonomous AI Agents: “Westworld” simulation, CAMEL, BabyAGI, AutoGPT ⭐ LangChain ⭐

Video Statistics and Information

Captions
Autonomous AI agents have been the hottest topic lately. It is truly impressive how rapidly things have progressed and unfolded in this area — even AI experts, including Andrej Karpathy, have referred to AutoGPT as the next frontier of prompt engineering. I think so as well. What do you think? There are at least four notable autonomous AI agent projects that came out in just the last two weeks, and in this video we're going to dive into each of them.

First up, we have the "Westworld" simulation. The original paper is called "Generative Agents," but people have been calling it the mini-Westworld or Westworld simulation. Researchers from Stanford and Google created a sandbox of 25 AI agents that can interact with each other and simulate human behavior. You can click on an agent to see where it is and what it's doing right now. For example, this agent, Tamara, is performing certain actions; if we pause, we can see exactly what she's doing — she's having a conversation, discussing creative projects and local politics. It looks really real, and the conversation itself reads naturally too.

Going back to the original generative agents paper: as you can see, the agents can take a walk in the park, share news with colleagues, and arrive at school. They simulate believable human behavior — not just fake behaviors, but behaviors that are actually believable. What's even more impressive is that they can plan a Valentine's Day party together. Starting with only a single user-specified notion that one agent wants to throw a Valentine's Day party, the agents autonomously spread invitations to the party over the next two days, ask each other out on dates to the party, and coordinate to show up together. This is really impressive.

So how is this all possible? It's possible because of the generative agent architecture, which has three key components: memory, reflection, and planning.

The memory stream contains a list of observations for each agent at each timestamp. It's simply a log of all the observations, whether they are behaviors originated by the agent or behaviors the agent perceives from others. The memory stream can get really long, because it's just a log, so we need to figure out which memories are important. Think about it as human beings: we have a lot of memories in our long-term memory system, but we can't recall all of them — we only remember the important ones or the most recent ones. That's exactly how this memory system works.

For retrieval, each observation gets a retrieval score, which is a combination of recency, importance, and relevance. Recency means the most recent memories score higher. Importance means memories the agent believes to be more important get a higher score — for example, breaking up with someone should be a more important memory than having breakfast this morning. Relevance is how relevant the memory is to the current situation or query. This way, the agent retrieves only the subset of memories with the highest retrieval scores.

The second important component is called reflection. When you look only at raw observations, it's really hard to generalize and make inferences. As you can see, a set of observations can contribute to different reflections, which are higher-level, more abstract thoughts generated by the agent. Reflections are generated periodically — not all the time — using essentially two questions: given only the information above, what are the 3 most salient high-level questions we can answer about the subjects in the statements? And: what 5 high-level insights can you infer from the above statements? So that's reflection: high-level, abstract thoughts generated by the agent.
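The recency/importance/relevance scoring described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the paper's implementation: the exponential decay rate, the equal weights, and the hand-built embeddings are all assumptions for the example.

```python
import math

def retrieval_score(memory, query_embedding, now, decay=0.995,
                    w_recency=1.0, w_importance=1.0, w_relevance=1.0):
    """Combine recency, importance, and relevance into one score.

    memory: dict with 'embedding' (list of floats), 'importance' (0..1),
            and 'last_accessed' (timestamp in hours). All illustrative.
    """
    # Recency: exponential decay over hours since the memory was last accessed
    recency = decay ** (now - memory["last_accessed"])
    # Relevance: cosine similarity between the query and memory embeddings
    dot = sum(a * b for a, b in zip(memory["embedding"], query_embedding))
    norm = (math.sqrt(sum(a * a for a in memory["embedding"]))
            * math.sqrt(sum(b * b for b in query_embedding)))
    relevance = dot / norm if norm else 0.0
    # Weighted sum of the three signals
    return (w_recency * recency
            + w_importance * memory["importance"]
            + w_relevance * relevance)

def retrieve(memories, query_embedding, now, k=3):
    """Return only the top-k memories by retrieval score."""
    return sorted(memories,
                  key=lambda m: retrieval_score(m, query_embedding, now),
                  reverse=True)[:k]
```

With this scheme, an old, trivial memory (low importance, high decay) naturally loses out to a recent, important one even when both are equally relevant to the query.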
The third important component, as we said before, is planning. Planning is important because you don't want your agent to do the same thing over and over again — for example, Klaus having lunch at 12, but then again at 12:30 and at 1 pm. With planning, you plan over a longer period of time, so the agent's sequence of actions can be coherent and believable. And of course, the agent can react and update its plans according to new observations in the memory system.

So that's the Westworld simulation: observations go into the memory system; relevant memories are retrieved; the agent acts based on those relevant memories; and it also reflects and makes plans, with reflections and plans feeding back into the memory system. It's a loop that repeats over and over again. The possibilities for applications here are huge — this is kind of a big deal, and maybe even a little scary. Just imagine an agent that watches and observes your every move each day: it could make plans for you and perhaps even execute them. For example, the agent could know when you get up and help you make your first cup of coffee. So scary and so useful at the same time. LangChain is actively working on this right now, so hopefully soon we'll have something to play with in LangChain. I'm looking forward to that. So that's the first project I think you should know.

The second project I want to talk about is called CAMEL: Communicative Agents for "Mind" Exploration of Large Scale Language Model Society. Similar to the generative agents we just talked about, this one also uses more than one AI agent, and the agents can communicate and interact with each other. CAMEL proposes a role-playing framework with two agents: an AI user agent and an AI assistant agent. The AI user agent gives instructions for completing a specific task, and the AI assistant agent provides solutions for the task. There's also a third agent, called the task specifier, that brainstorms a specific task for the AI user and the AI assistant to complete.

As an example, a human user has an idea: develop a trading bot for the stock market. The task specifier then comes up with a specific task: develop a trading bot that can monitor social media platforms and execute trades based on sentiment analysis results. This agent is really helpful, because you don't need to come up with the specific ideas and steps to feed into the language model — you only need an idea, and this agent generates the specific task for you. Then, with the specific task in mind, the AI user agent effectively becomes the task planner and the AI assistant agent becomes the task executor. They prompt each other in a loop until some termination condition is met.

The essence of CAMEL is really prompt engineering — they call it inception prompting. There are specific prompts that CAMEL uses for the assistant and for the user, and the wording has specific purposes. For example, "Never flip roles!" prevents the agents from flipping roles; some wording prohibits the agent from producing harmful or misleading information; some prompts make sure the assistant always responds in a consistent format; and the assistant always ends its solution with "Next request," so that the conversation keeps going.

Let's take a look at the LangChain implementation of CAMEL, which was just released this week. First, we install the needed packages, langchain and openai, and import them. By the way, I will link all the Colab notebooks I use today in the description below, so take a look if you need them. We define a CAMELAgent helper class. Then comes the part we actually need to set up ourselves: set your OpenAI API key, give the assistant agent a role name (in the example, "Python Programmer"), give the user agent a role name ("Stock Trader" in this example), and then specify a task — the example is "Develop a trading bot for the stock market."

Next, we create a task specifier agent for brainstorming and getting the specific task. Here is the task specifier prompt: "Here is a task that [the assistant] will help [the user] to complete" — and the task is to develop a trading bot for the stock market. You don't actually need to change anything here; it's ready to go. Then we have the inception prompts for the AI assistant and the AI user for role-playing, with the same wording as in the paper I just showed you. After a few more helper functions, we can create the AI assistant agent and the AI user agent, and finally start the role-playing session to solve the task. As you can see, it's basically just a while loop. If you set the chat turn limit — the number of conversation turns — really high, it can go on forever, but I set it to 5 because I don't want to spend too much money on it. Also, by the way, I just learned that GPT-3.5 Turbo is cheaper than GPT-3, so keep that in mind when you set this up.

Okay, let's look at the result. The original task prompt is "Develop a trading bot for the stock market," and the specified task prompt becomes something like: create a Python-based trading bot using real-time market data, analyze historical data trends, and implement risk management strategies. It's a little different from the example, but I think it makes sense. The AI user agent — the stock trader — gives an instruction: install the necessary packages for the trading bot. The AI assistant then provides a solution: create a requirements.txt file and list all the required packages in it. That makes sense, but it didn't say which packages are required in the requirements.txt file, so it's not perfect. Then the AI user gives another instruction: connect to a real-time market data source.
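The instruct/solve loop described above can be sketched without any API calls. This is a minimal illustration of the two-agent role-playing pattern, not the actual LangChain or CAMEL API: `fake_llm`, the class names, and the prompts are all stand-ins, with a stubbed function in place of a real chat model.

```python
# Minimal sketch of CAMEL-style role playing: an AI user issues instructions,
# an AI assistant solves them, and each turn feeds the other agent's last
# message. `fake_llm` is a stand-in for a real chat-model call.

def fake_llm(system_prompt, incoming_message):
    """Stubbed LLM: the user agent instructs, the assistant agent solves."""
    if "AI user" in system_prompt:
        return f"Instruction: next step after '{incoming_message[:30]}'"
    return "Solution: done. Next request."

class RolePlayAgent:
    def __init__(self, system_prompt):
        self.system_prompt = system_prompt
        self.history = []

    def step(self, incoming_message):
        reply = fake_llm(self.system_prompt, incoming_message)
        self.history.append((incoming_message, reply))
        return reply

def role_play(task, turn_limit=5):
    user = RolePlayAgent("You are the AI user. Give one instruction at a time. Never flip roles!")
    assistant = RolePlayAgent("You are the AI assistant. Solve each instruction. Never flip roles!")
    message, transcript = task, []
    for _ in range(turn_limit):          # hard cap, like the chat turn limit
        instruction = user.step(message)
        solution = assistant.step(instruction)
        transcript.append((instruction, solution))
        if "CAMEL_TASK_DONE" in instruction:   # illustrative stop token
            break
        message = solution               # the solution becomes the next input
    return transcript
```

Note the two deliberate design points from the paper mirrored here: the "Never flip roles!" wording in each system prompt, and "Next request." at the end of every solution to keep the conversation alive.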
The AI assistant then went to a website to get the data — is this a real website? I'm not sure whether it's real or not; maybe it is and it just needs an API key. Then we have "implement a function to track portfolio performance by calculating the total value of all stocks currently held," and we get a track-portfolio function.

One issue is that all the responses come from the large language model — it's not actually executing the Python code, and it's not using any tools. It would be nice if, instead of only generating Python code as text, we could actually execute and run the code; and if it doesn't run, debug and keep trying until it works. So I'm hoping LangChain will integrate all its amazing tools with CAMEL — connecting CAMEL with different tools would definitely give it more power.

There are some interesting use cases that I saw on the internet. This one is, I think, from a developer of CAMEL: he asked CAMEL to take control of the world, and the AI user and assistant were able to make a plan — infiltrate the communication networks, take over the financial infrastructure and political systems, and defend AGI operations in space from operational threats. So there could be some really scary stuff out there, potentially. Anyway, that's CAMEL. I actually really like this one because it's so fun — it's just two agents talking with each other. That's the second project.

The third project is BabyAGI. You've probably already heard about it, or have even tried it. I just love the name BabyAGI so much — it's so cute; I think it's a really good choice of name. The BabyAGI documentation was actually written by GPT-4, which is very interesting. Unlike the previous two projects, where multiple agents interact and communicate with each other, BabyAGI has a single AI agent with three task agents. The task agents don't converse with each other; instead, they work together in a loop to create tasks in a to-do list, execute tasks, and reorder the tasks.

The author, Yohei, mentioned in a webinar that BabyAGI works exactly how he works: he starts each morning by working on the first item on his to-do list; when new tasks arise, he adds them to the list; and at the end of the day he re-evaluates and re-prioritizes the tasks on the list. He mapped the same approach onto BabyAGI's task agents. Basically, we have a task queue. The user provides the objective and the first task. The task execution agent completes the first item in the task queue. The task creation agent receives the task result, together with context from memory, and creates new tasks. Finally, the task prioritization agent re-prioritizes the tasks and returns the cleaned task list. This process then repeats over and over again.

Let's take a look at the LangChain implementation. We install and import the needed packages, and you type in your OpenAI API key. Then we connect to the vector store — here we're using FAISS, but you don't have to; there are other vector stores you can choose.

Now, the important parts are the three LLM chains. The first is the task creation chain; you can see the prompt: you are a task creation agent that uses the result of an execution agent, with the following objective — you see the result of the previously completed task, and based on that result you create new tasks to be completed. The task prioritization chain cleans the formatting of, and prioritizes, the following tasks, returning the result as a numbered list. The execution chain performs one task based on the following objective, taking into account previously completed tasks. So those are the three chains. Now, to define the BabyAGI controller, we have a while loop — looks familiar.
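The execute / create / re-prioritize cycle described above can be sketched as a small controller loop. This is an illustration of the pattern only — the three "chains" are stubbed functions standing in for the LLM-backed creation, prioritization, and execution chains, and there is no vector store here.

```python
from collections import deque

# Sketch of the BabyAGI controller loop. The three stubs below stand in
# for the LLM-backed task creation, prioritization, and execution chains.

def execute_task(objective, task):
    return f"result of {task!r}"                    # execution chain stub

def create_tasks(objective, result, existing_tasks):
    return [f"follow-up to {result}"]               # task creation chain stub

def prioritize(tasks, objective):
    return sorted(tasks)                            # prioritization chain stub

def baby_agi(objective, first_task, max_iterations=3):
    task_queue = deque([first_task])
    results = []
    for _ in range(max_iterations):                 # stop at the iteration cap
        if not task_queue:
            break
        task = task_queue.popleft()                 # 1. pull the first task
        result = execute_task(objective, task)      # 2. execute it
        results.append((task, result))
        new_tasks = create_tasks(objective, result, list(task_queue))  # 3. create new tasks
        task_queue.extend(new_tasks)
        task_queue = deque(prioritize(list(task_queue), objective))    # 4. re-prioritize
    return results
```

The iteration cap matters: because the creation chain always produces new tasks, the loop would otherwise run (and spend API credits) forever.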
In the loop, we pull the first task and execute it; the result is then stored in the vector store (we're using FAISS here, so the reference should say FAISS). Steps 3 and 4 create new tasks and re-prioritize the task list. The loop stops if we give it a max-iterations number and the number of iterations reaches that maximum.

Here's an example: write a weather report for San Francisco today. The first task is to make a to-do list — makes sense. Now we execute this task, and the result is a to-do list, as expected: check the current temperature in San Francisco, check the forecast for the day, and so on, which all makes a lot of sense. Now we get a new task at the top: check the current temperature in San Francisco. The result says the current temperature is 57 degrees Fahrenheit — which is actually not correct, by the way, because this is just a language model and it doesn't actually know the real temperature right now. Also, because I only ran three iterations, it didn't actually get to the report-writing part. But that's the basic idea, and you can see why it's important to include different tools to work with BabyAGI, so that the information can be more accurate.

In the next example, I tried BabyAGI with LangChain tools — it's already integrated with LangChain tools. After `pip install google-search-results`, we can do Google searches; we need to provide the SerpAPI key for that. There are two code chunks that are different. The first one defines the tools we use: a search tool to answer questions from Google search, and a TODO chain, which is useful for making to-do lists. Then, when we define the BabyAGI controller, we need to pass those tools to our agent.

Here's the same objective, "write a weather report for San Francisco today." As you can see, it uses Google search to look up local weather stations and the current temperature in San Francisco, and it was able to get the current temperature. This one is interesting: it came up with a new task to compare current weather conditions to historical data.
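The tool setup described above — a search tool plus a to-do tool that the agent picks between — can be sketched as a simple name-to-function registry. This is plain Python for illustration, not LangChain's actual `Tool` API; both tool bodies are stubs (the real versions would call SerpAPI and an LLM chain).

```python
# Sketch of tool dispatch: the agent chooses a tool by name and calls it.
# Both tool bodies are stubs standing in for SerpAPI and an LLM chain.

def search_tool(query):
    return f"search results for {query!r}"          # would call SerpAPI

def todo_tool(objective):
    return [f"step 1 for {objective}",              # would call an LLM chain
            f"step 2 for {objective}"]

TOOLS = {
    "Search": search_tool,  # "useful for answering questions about current events"
    "TODO": todo_tool,      # "useful for making to-do lists"
}

def run_tool(name, tool_input):
    """Dispatch a tool call by name, failing loudly on unknown tools."""
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    return TOOLS[name](tool_input)
```

The short description attached to each tool is what the agent's prompt actually sees; it's how the model decides which tool fits the current task.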
We didn't ask for historical data at all, but it was able to come up with this comparison, because a weather report is nicer with some comparisons. The next one is a little weird: I think it had a good idea, but we didn't give BabyAGI enough tools to make this visual representation, so it didn't go well. The example on the LangChain website actually gives a somewhat better result — you can see it's pretty comprehensive and nice.

There are already a lot of real-world use cases for this, which is quite impressive: a project that can write and execute code; AgentGPT, which is also very popular — it's AutoGPT in the browser; a BabyAGI ChatGPT plugin; an example of BabyAGI in Slack; BabyAGI on Discord; scientific literature reviews. So that's BabyAGI.

The final project we have today is Auto-GPT. Auto-GPT is a lot like BabyAGI, but with its own framework and its own set of tools. It's also an infinite loop of generating thoughts, reasoning, criticizing, planning the next action, and executing. One special thing about Auto-GPT is that it comes with a set of tools. It's not in the LangChain framework yet, but hopefully it will be soon. I want to show you the tools it has: you can run many different commands with Auto-GPT — start agent, message agent, list agents, text summary, read file, write to file (I'm not sure LangChain has tools to read and write files), and also search files, browse website, evaluate code, write tests. This is quite nice. And when you look at exactly what's under the hood, each command maps to a function — for example, there's basically a function for getting results from Google search — and those are all functions you can craft and adapt to your own use cases, which I think gives you a lot more flexibility if you'd like to use it. LangChain is actually working on including Auto-GPT in LangChain, so hopefully soon we can use it there as well.

For now, let's just take a look at the code from the GitHub repo. First, we git clone the repo and change into the Auto-GPT directory; you can see what's in the directory. Then we pip install the required packages — they're all listed in requirements.txt, so you just run pip install with requirements.txt. There is a file called .env.template that lists all the API keys; the instructions say to rename it to .env and fill in your OpenAI API key. What I did instead is what we did before: set the OpenAI API key in the environment directly. Then, to run Auto-GPT, you just run the autogpt module with Python; you can use the gpt3only flag if you don't have access to GPT-4.

Okay, let's run it and see what happens. With Auto-GPT, you give your AI a name (or just use the original one), and then you describe your AI's role — for example, an AI designed to develop and run businesses with the sole goal of increasing your net worth. Let's just try that. Now you can see it has thoughts: "I need to determine what business moves to make to increase my net worth." It has some reasoning: browsing Forbes can provide insights. It has a bunch of plans: browse the Forbes website; determine whether any of those industries or companies are capable of being replicated. And we have criticism: "I could get lost browsing and lose track of time, so I need to keep track of my time." So the next action is actually browsing the Forbes website with the question "What industries and companies are currently profitable?" — and then we get "Failed to parse." Not sure what happened, but let's just enter y and see. The next action was to start an agent to do web scraping for the website, and then it said: I'm not authorized to perform web scraping; instead, I will use Google to search for articles. Now the plan becomes searching Google for currently profitable industries or companies, and the next action becomes a Google search.
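The thoughts / reasoning / plan / criticism / next-action structure described above is easiest to see as a parsed, structured response. The sketch below is an approximation for illustration — the JSON layout and field names are not Auto-GPT's exact schema — and it also shows the human yes/no gate before a command runs.

```python
import json

# Illustrative sketch: parse an Auto-GPT-style structured response and
# gate the proposed command behind human confirmation. The JSON layout
# here is an approximation, not Auto-GPT's exact schema.

def parse_response(raw):
    data = json.loads(raw)
    thoughts = data["thoughts"]
    command = data["command"]
    return {
        "text": thoughts.get("text", ""),
        "reasoning": thoughts.get("reasoning", ""),
        "plan": thoughts.get("plan", []),
        "criticism": thoughts.get("criticism", ""),
        "command_name": command["name"],
        "command_args": command.get("args", {}),
    }

def next_action(raw, authorize):
    """Return (name, args) to execute, or None if the human declines."""
    parsed = parse_response(raw)
    if authorize(parsed["command_name"], parsed["command_args"]):
        return parsed["command_name"], parsed["command_args"]
    return None
```

A malformed model reply is what produces the "Failed to parse" message seen in the run above: `json.loads` raises, and the loop has to re-prompt or ask the human what to do.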
I really like that Auto-GPT gives the human a chance to say yes or no before proceeding. So I say yes to authorize the command, and now it's doing the Google search. The plan is then to use a GPT agent to read the information from the top search result, and the next action is to start an agent — an "industry analyzer" — to read and analyze the information. Okay, let's say yes and see what happens. Auto-GPT has identified four profitable industries. I'm not sure finding the industry first is the best approach — it's kind of weird — but that's the basic idea of Auto-GPT. You can have this conversation with Auto-GPT forever; it will just keep going, and you can keep spending your OpenAI API money.

I want to show you some real-world use cases for Auto-GPT as well. The first example uses Auto-GPT to write its own code and execute Python scripts, which I think is really cool. This one does market research; this one browses recent events and creates a podcast; others book flights and order food online. Here is an example of using Auto-GPT on your iPhone. Auto-GPT has a lot of use cases already — it's quite incredible.

So that's it for this video. We talked about four autonomous AI agent projects. Despite being in the early stages of development, they have already shown impressive functionality and potential applications, and I'm very excited to see what happens next. If you found this video helpful, thank you, and see you next time. Bye!
Info
Channel: Sophia Yang
Views: 7,102
Id: yWbnH6inT_U
Length: 26min 17sec (1577 seconds)
Published: Sun Apr 16 2023