How To Use AutoGen With ANY Open-Source LLM for FREE

Video Statistics and Information

Captions
In this video, I'll demonstrate how to run AutoGen agents for free and share insights on how to avoid some common pitfalls. I've spent a lot of time frustrated with the poor performance of AutoGen on a local LLM, so you don't have to. Why AutoGen? Because with autonomous agents, you can build almost anything you can imagine. For instance, AutoGen applications are already capable of handling most tasks typically done by customer support staff, including responding to chat and email inquiries. If you're interested in learning more about these specific use cases, I encourage you to check out the videos linked in the description below.

However, there is a significant downside. Developers, in particular, may face high daily costs. Every task performed by an agent, as well as the selection of the next agent in a group chat, consumes tokens. The simplest way to cut down on expenses is to opt for a more basic model. Merely switching from GPT-4 to GPT-3.5 Turbo can slash costs dramatically. But it gets even better: we can also use AutoGen with free open-source alternatives such as Llama 2.

In an ideal scenario, we would just tweak the configuration, and everything would operate seamlessly like clockwork. In reality, things don't work quite that smoothly. But no worries, I'll guide you through setting everything up from scratch and reveal the secrets to running AutoGen locally with ease and efficiency.

So, as always, we start by creating a virtual environment to separate the dependencies from each other. Then we activate the virtual environment and install AutoGen with 'pip install pyautogen'. Next, we create a new file, 'app.py', and start by importing the dependencies, that is, autogen.

We create a new configuration for the language model. Specifically, we want to start with the GPT-3.5 Turbo model, which is very fast and will first show us how the group chat basically works. Okay, now that we have the configuration set up, let's explore the next step.

We will first create the assistant agents, Bob and Alice. Bob tells jokes while Alice criticizes them. Additionally, we will include a user proxy, which is essential for facilitating interactions between the automated system and human users. Creating an assistant with AutoGen is relatively straightforward: it requires just a name, a system message, and a configuration. This simplicity enables quick setup and deployment of automated assistants for various applications.

We copy Bob and make an Alice out of it, a little bit like in the Bible. Alice's task is to criticize Bob's jokes, a little bit like in real life. She can also terminate the chat. Next, we create the user proxy, which is responsible for communication between us as developers and the AutoGen agents. We set the now-mandatory code execution config attribute and specify that Docker should not be used at all. We also assign a termination message check, which we quickly write: it simply tests whether the word 'terminate' appears in a message, so that if one of the agents says it, the group chat ends immediately.

Next, we set the human input mode to 'NEVER', which tells the user proxy that no inquiries should come back to us and that the entire chat should run independently.
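Putting the steps so far together, here is a minimal sketch of what app.py might look like at this point. It assumes the pyautogen package and an OPENAI_API_KEY environment variable; the exact system messages are illustrative, not the video's verbatim wording:

```python
import os
import autogen

# GPT-3.5 Turbo configuration (API key read from the environment here).
llm_config = {
    "config_list": [
        {"model": "gpt-3.5-turbo", "api_key": os.environ["OPENAI_API_KEY"]}
    ]
}

# Bob tells jokes; Alice criticizes them and may end the chat.
bob = autogen.AssistantAgent(
    name="Bob",
    system_message="You are a comedian. Tell short jokes.",
    llm_config=llm_config,
)
alice = autogen.AssistantAgent(
    name="Alice",
    system_message="You criticize jokes. Reply with TERMINATE when you are done.",
    llm_config=llm_config,
)

# The user proxy relays between us and the agents: no Docker, no human input,
# and a simple check that ends the chat when 'terminate' appears in a message.
user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    code_execution_config={"use_docker": False},
    is_termination_msg=lambda m: "terminate" in (m.get("content") or "").lower(),
    human_input_mode="NEVER",
)
```

The is_termination_msg check is what lets Alice end the run simply by saying 'terminate.'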
Just a quick interruption to revisit our blueprint. In the next step, we will establish the group chat and the group chat manager. This manager is key for overseeing the interaction dynamics among the agents within the group chat, ensuring a smooth and coherent conversation flow.

Then we create our group chat. We assign the agents Bob, Alice, and the user proxy to it, and start with an empty message array. And we create our manager, who leads the group chat. We assign the group chat to him, and here too we must specify the same code execution config. In our case this doesn't matter, because we don't want any code to be written. We assign the configuration again, and here too we specify that the group chat should be terminated if the word 'terminate' appears within a message.

Then we can initialize the group chat. We assign the manager and set the objective of the chat: 'tell a joke.' Let's try that out. We see that Bob has already told a joke, and Alice has criticized it and responded with 'terminate.'
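In code, the group chat and manager wiring might look roughly like this, continuing the sketch above (the objective is the prompt from the video):

```python
# The group chat holds all agents and starts with an empty message history.
group_chat = autogen.GroupChat(
    agents=[bob, alice, user_proxy],
    messages=[],
)

# The manager leads the conversation; it reuses the same LLM config,
# code execution settings, and termination check as the agents.
manager = autogen.GroupChatManager(
    groupchat=group_chat,
    llm_config=llm_config,
    code_execution_config={"use_docker": False},
    is_termination_msg=lambda m: "terminate" in (m.get("content") or "").lower(),
)

# Kick off the chat with our objective.
user_proxy.initiate_chat(manager, message="Tell a joke.")
```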
Before we can use a local language model, we must, of course, make it available first. A very simple way is to use LM Studio. We can download LM Studio for Mac, Windows, or Linux easily. Once we have installed the program, we can download language models directly to our device via an easy-to-use interface. And once we have downloaded the language models, we can test them in a chat to see if they meet our quality requirements. In this case, we use a Mistral 7B Instruct model and ask it to tell a joke. We see that it performs this task brilliantly. But the exciting part is not using the language models here in the chat, but with AutoGen.

For this, LM Studio provides API endpoints that we can use with the same syntax as the OpenAI chat completion API. Let's try this out with cURL. We simply copy the example request, and we see that it is answered with a correct response. We then copy the URL of the endpoint and can start using it in AutoGen.

For this, we create a new configuration. It has the same structure as our configuration above. We start by specifying a model; in this case, we want to use our local Llama 2 model, and we give it the base URL that we copied to the clipboard earlier. In the next step, we switch to Llama 2 in LM Studio. We start the server again and now assign our local config to our agents so that they use the Llama 2 language model and no longer GPT.

Alright, now we can start the group chat again. We clear the cache once and keep our fingers crossed. Unfortunately, we immediately see an error message: AutoGen had trouble identifying the next agent, because the language model returned 'Bob' with exclamation marks and a smiley as the next possible candidate, and AutoGen obviously cannot process this.

And here we come to our trick. Instead of letting AutoGen decide independently who the next speaker is, we can simply use a rotating procedure like round-robin. Now each agent takes its turn, and there is no longer any selection process. We try this, and we see we no longer get the error message. Bob tells a joke, Alice responds to it. Unfortunately, Alice now responds directly with a 'terminate.'

Here we come to the next important point. Local language models are not as powerful as GPT-3.5 or GPT-4. Therefore, we often have to adapt the prompt so that it is precise enough for these 7-billion-parameter language models to cope with. That means we adjust the system message once and try our luck again. We ask again for a joke. Bob tells the joke, and we are curious what Alice says. And Alice actually criticizes the joke first and then ends the chat. Wonderful! That was exactly the result we wanted, and we see that we were able to completely switch our group chat from GPT to Llama 2, so there will actually be no more costs incurred.
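For reference, the local switch might look roughly like this, continuing the script above. This is a sketch with assumptions: the base_url is LM Studio's default server address (use the URL your LM Studio instance shows), the model label and API key are placeholders that LM Studio does not validate, and the sharpened system message is a hypothetical example of the kind of explicit wording 7B models tend to need:

```python
# Local configuration: same shape as before, but pointing at LM Studio's
# OpenAI-compatible endpoint instead of OpenAI.
local_llm_config = {
    "config_list": [
        {
            "model": "llama-2",                     # label for the loaded local model
            "base_url": "http://localhost:1234/v1", # LM Studio's default server URL
            "api_key": "not-needed",                # LM Studio ignores the key
        }
    ]
}

# A more explicit system message, since small local models need precise
# instructions (hypothetical wording). Bob is recreated with
# local_llm_config in the same way.
alice = autogen.AssistantAgent(
    name="Alice",
    system_message=(
        "You are a joke critic. First write one short sentence criticizing "
        "the last joke. Only after that, reply with TERMINATE."
    ),
    llm_config=local_llm_config,
)

# Round-robin speaker selection: agents simply take turns, so the local LLM
# is never asked to pick the next speaker.
group_chat = autogen.GroupChat(
    agents=[bob, alice, user_proxy],
    messages=[],
    speaker_selection_method="round_robin",
)
manager = autogen.GroupChatManager(groupchat=group_chat, llm_config=local_llm_config)

user_proxy.initiate_chat(manager, message="Tell a joke.")
```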
If you found this video helpful, you're going to love our detailed video about building an AutoGen-powered customer support agent. Simply click on the video link to dive in and enjoy.

Info
Channel: AI FOR DEVS
Views: 15,303
Keywords: ai, chatgpt, artificial intelligence, chatdev tutorial, ai agent, ai agents, autonomous ai agents, autogpt, build autonomous agent with python, chat gpt, gpt 4, tutorial, step by step, python ai chatbot tutorial, ai automation agency, how to setup autonomous ai, your first software ai team, ai tools, artificial intelligence and machine learning, microsoft autogen, autogen, auto gen, ai tutorial, memgpt, mem gpt, lmstudio, lm studio, memgpt tutorial, lmstudio tutorial
Id: aWml0ncqPnU
Length: 9min 33sec (573 seconds)
Published: Fri Feb 02 2024