How To Use AutoGen with ANY Open-Source LLM Tutorial

Video Statistics and Information

Captions
Hey, and welcome back! In this video I'm going to show you how to set up other open-source LLMs and connect them to AutoGen. I'll quickly go over how it works, then we'll set it up, and at the end I'll walk through a sample use case.

Here's the plan: first we download LM Studio, then we choose a large language model to use with AutoGen, load that model into the software, start a local server, and finally connect it to AutoGen.

Step one: go to lmstudio.ai and download LM Studio for your machine, then run it. When it opens you'll see the main screen of LM Studio; it's a standalone piece of software, and the only part we need is the search bar in the middle near the top, where you can search for models by keyword or paste a repo URL. I'm going to use Llama for this, so type "llama" and hit enter, and it returns all the models we can choose from. You can also sort the results at the top by downloads, likes, or how recent they are. For our case I'll click the top result by TheBloke, and on the right you can see all of the files available to download; these are just different quantization sizes. What I'd recommend, and what I went with, is the GGUF Q4_K_M variant. Click Download next to the file you chose, and you can see the download progress at the bottom; once it's done, we'll go to the next step.

Now that the model has finished downloading, click the little double-arrow icon on the left-hand side of the software. This is where we start our local server and load the model. At the top where it says "Select a model to load", click the dropdown (I have a few downloaded, but I'll pick the Llama model I just downloaded) and give it a minute to load the model into the software. Once it's loaded, note the address shown on screen: http://localhost:1234/v1. That's the URL, the API endpoint, we'll use to connect AutoGen to this model so we can send it our prompts and see what it gives us. The last step here is simple: click the green "Start Server" button, and that's it, the server is running with the Llama model loaded. All that's left is to connect it to AutoGen.
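Before wiring this into AutoGen, it can help to sanity-check that the local server answers at all. Here is a minimal sketch, assuming LM Studio's default OpenAI-compatible endpoint at http://localhost:1234/v1 as shown in its UI; the `model` field is a placeholder, since LM Studio serves whichever model you loaded:

```python
# Quick sanity check of the LM Studio local server (a sketch, not from the
# video; assumes the default endpoint http://localhost:1234/v1).
import requests

resp = requests.post(
    "http://localhost:1234/v1/chat/completions",
    json={
        "model": "local-model",  # placeholder; LM Studio uses the loaded model
        "messages": [{"role": "user", "content": "Say hello in one sentence."}],
        "temperature": 0.7,
    },
    timeout=60,  # local models can be slow to respond on first request
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```

If this prints a reply, the server is up and AutoGen should be able to reach it at the same URL.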
Okay, now on the AutoGen side. All I did was create a Python file with everything set up. You don't need to pause the video and copy all of this down; there's a link in the description with the full code. I'll just go over it briefly, get everything set up, show you a use case, and then you can try it on your own local machine.

The difference here is that we have to connect AutoGen to the local server so it can use the Llama model. You do that with a config list containing three things: an API type, an API base, and an API key. For the API type you still use "open_ai"; we're still speaking the OpenAI API format, we just don't need to call ChatGPT and spend money on requests. Instead this is free, no charge, using open-source LLMs. The API base is where we put the local server's URL so AutoGen can connect to it, and the API key can just be null, because you don't need an OpenAI key for this. Then for the LLM configuration we pass in that config list.

One thing I did add is max_tokens set to -1. While the server is running, every word is essentially a token, and some models have a default maximum number of tokens. I haven't tested every model, but this Llama build, for instance, defaulted to about 1,500, and I had a lot of agents in the group with a lot of talking involved. Without the override it wasn't exactly an error, but the server would report that I'd reached the maximum number of tokens; setting it to -1 let the agents talk to each other without interruption.

As for the use case, I have five participants: me as the human admin, plus a content creator, a script writer, a researcher, and a reviewer. The whole idea is to create a YouTube script about the latest paper on GPT-4 and its potential applications. Because I wanted this to be a group chat, I created a group chat with all of the agents in it, set a manager up as the group chat manager and gave it the group chat so all the agents can talk to each other, gave it the LLM configuration, and then, as the admin, I initiate the chat with the whole group. A sketch of the whole file appears below.

The last step is to just run it: go to your terminal, run the script with Python (mine was something like `python llama_lm_test.py`), hit enter, and off it goes. I'll run this off screen, and when we come back I'll show you the chat and how it works.

Okay, we're done; it ran. As you can see in the LM Studio server logs, it shows the conversations everyone had with each other. And here's what I meant about tokens: in the logs, one token was the word "note", the second token was a semicolon, the third was the letter "I", the fourth was the apostrophe, and so forth. Each of those is a token, and with max_tokens at -1 you don't have to worry about what each model's maximum is, which helps, because the server can struggle with some models' limits while it's running. You can see the full conversation in LM Studio, and if we go back to PyCharm, you can see the same conversation there. Because I changed the temperature, the agents didn't communicate that well with each other on this run and didn't come up with a super-revised script, but here are the points the content creator wanted, here's the introduction, and then the script writer, the researcher, and everyone else signed off with a "yep, that's good".
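The full file isn't shown in detail on screen, but here is a sketch of the setup just described, written against the pyautogen 0.1.x API that was current when this video was published (later AutoGen releases changed the config keys). The agent system messages and the max_round value are my own illustrative guesses, not the exact ones from the video:

```python
# llama_lm_test.py -- a sketch of the group-chat setup described above.
# pyautogen 0.1.x style config; system messages are illustrative only.
import autogen

config_list = [
    {
        "api_type": "open_ai",                   # still the OpenAI API format...
        "api_base": "http://localhost:1234/v1",  # ...but pointed at LM Studio
        "api_key": "NULL",                       # no real key needed locally
    }
]

llm_config = {
    "config_list": config_list,
    "max_tokens": -1,  # lift the per-model default cap (~1,500 for this build)
}

# Me, the human admin, who kicks off and oversees the task.
admin = autogen.UserProxyAgent(
    name="Admin",
    system_message="A human admin who initiates and oversees the task.",
    human_input_mode="TERMINATE",
    code_execution_config=False,
)
content_creator = autogen.AssistantAgent(
    name="Content_Creator",
    system_message="Outlines the key points the YouTube video should cover.",
    llm_config=llm_config,
)
script_writer = autogen.AssistantAgent(
    name="Script_Writer",
    system_message="Writes the YouTube script from the outline.",
    llm_config=llm_config,
)
researcher = autogen.AssistantAgent(
    name="Researcher",
    system_message="Fact-checks claims about the GPT-4 paper.",
    llm_config=llm_config,
)
reviewer = autogen.AssistantAgent(
    name="Reviewer",
    system_message="Reviews the script and suggests revisions.",
    llm_config=llm_config,
)

# Put every agent in one group chat so they can all talk to each other.
groupchat = autogen.GroupChat(
    agents=[admin, content_creator, script_writer, researcher, reviewer],
    messages=[],
    max_round=12,
)
manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=llm_config)

# The admin initiates the chat with the whole group.
admin.initiate_chat(
    manager,
    message="Create a YouTube script about the latest GPT-4 paper "
            "and its potential applications.",
)
```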
But you know, this was just an example. When I ran it last night it actually gave me everything, and I think part of the problem this time was request timeouts, or maybe something up with the LM Studio server, because it was taking quite a long time, whereas last night it was really quick. That's okay; you're going to run into things like this whenever you do this kind of work, you just keep trying, and maybe my prompting could be better. But anyway, it worked: we were able to use a different open-source model to create a YouTube script about an AI white paper.

All right, so we have successfully used other open-source LLMs through a piece of software called LM Studio, connected it to AutoGen, and in AutoGen had prompts and AI agents create a YouTube script for us via a group chat. One thing I did notice is that at night, whenever I connect AutoGen to open-source LLMs with LM Studio, it runs a lot smoother, better, and quicker, and I don't get request timeouts. If you do end up getting request timeouts, I'd suggest increasing the timeout in the llm_config, as you'll see in my configurations for this use case and in the sketch below. If you have any other questions, or something didn't quite make sense, leave a comment down below; and if you don't mind liking and subscribing, I'd be more than happy to talk with you about AI and respond to any of your comments. I'll see you in the next video, and have a good day!
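For the timeout tweak mentioned above, a minimal sketch: in the pyautogen 0.1.x config style, the knob was a `request_timeout` entry in `llm_config`, in seconds. Treat the exact key name as version-dependent, since later AutoGen releases renamed it to `timeout`:

```python
# Raising the request timeout for slow local models (pyautogen 0.1.x style;
# newer versions use "timeout" instead, so check your installed version).
llm_config = {
    "config_list": config_list,
    "max_tokens": -1,
    "request_timeout": 600,  # seconds; the default is much lower, so a slow
                             # local model can trip timeouts before replying
}
```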
Info
Channel: Tyler Programming
Views: 11,886
Keywords: ai, chatgpt, artificial intelligence, chatdev tutorial, ai agent, ai agents, autonomous ai agents, autogpt, build autonomous agent with python, chat gpt, gpt 4, tutorial, step by step, python ai chatbot tutorial, ai automation agency, how to setup autonomous ai, your first software ai team, ai tools, artificial intelligence and machine learning, microsoft autogen, autogen, auto gen, ai tutorial, open source llm, open source llm tutorial, openai, generative ai
Id: aYieKkR_x44
Length: 8min 23sec (503 seconds)
Published: Wed Oct 25 2023