Use AutoGen with ANY Open-Source Model! (RunPod + TextGen WebUI)

Video Statistics and Information

Captions
Today the AutoGen fun continues. In this video I'm going to show you how to set up AutoGen using a fully open-source model, using RunPod to host that model. That means you can run even the largest open-source models and have them power your AutoGen agents. It's amazing. We're going to use text-generation-webui to serve the API, and it's completely open source this time. You can follow these exact same steps to set it up on your local computer, but I'm going to show you how to do it on RunPod. Let's go.

I've put together a number of tutorials about RunPod; I'll drop a link in the description below, so I'm going to go through this initial part quickly. As usual, click over to the Secure Cloud tab and make sure you have the RunPod TheBloke LLMs template selected (also linked in the description). Then scroll down and grab an RTX A6000; go ahead and click it. Before we continue, click Customize Deployment and add 50001 to the ports we're going to expose. The reason is that we're going to mimic OpenAI's API with text-generation-webui so that it works seamlessly, and the text-generation-webui API port is 50001. Once you've done that, click Set Overrides, then Continue, then Deploy. This just takes a few minutes.

We're going to be using Eric Hartford's Dolphin 2.1 Mistral 7B. I just put together a review of Dolphin 2.0 Mistral 7B, and it performs amazingly well; check out that video, which of course I'll link in the description below. Right after I published it, he put out 2.1, and that's what we're using today. On the model's page, just click the little copy button to copy the model name; we'll hold onto that for later.

Back in RunPod, it looks like everything's ready to go: it says Running in the top right. Click Connect, then connect to port 7860; that's the interface. You'll see that port 50001 says "not ready"; that's because we haven't enabled it yet.

All right, here's text-generation-webui. Go to the Model tab, paste in Eric Hartford's Dolphin 2.1 Mistral 7B, and click Download. This is a relatively small model, so it should only take a few minutes. It's important that you follow these steps exactly, because text-generation-webui can sometimes be a bit brittle. We're using the unquantized version: the vanilla Mistral 7B model fine-tuned with the Dolphin dataset, and it's completely uncensored. Again, you can use any model you want; you could even try the Falcon 180B model on RunPod. Try any of the models and see which one works best for you. I personally want to try the biggest Code Llama models to see how they do with AutoGen, because a lot of the work I do with AutoGen has to do with coding. But for the purposes of this video, I'm showing you this smaller Mistral 7B model, which is extremely performant for its size.

Once the download finishes, it will say Done. Next, come up and click the little refresh button to load the model list, and there it is. One little bug I've seen: after clicking refresh, "None" is selected but there's a check mark next to the model I want, and clicking that model doesn't select it. The fix is to click None first, then click over to Eric Hartford's model and click Load. We're using the Transformers model loader here because it's an unquantized model. All right, successfully loaded.

Now comes the unique part. Click over to the Session tab, find the "openai" checkbox, check it, and click Apply flags/extensions and restart. That exposes an API that reflects the OpenAI API, essentially giving us a drop-in replacement for it. And there it is; now we're done. Click back to RunPod and you'll see "Connect to HTTP Service [Port 50001]" is now available; you don't need to click anything. Switch back to text-generation-webui and copy the URL from your browser.

Next, switching over to Visual Studio Code: this is the application we're building on top of AutoGen, and everything else stays the same. If you need instructions on how to get this set up, I've already covered it; I'll drop links to those videos in the description below. The thing we need to change is the config list. Keep the API type as open_ai. For the API base, paste in your RunPod URL, but instead of 7860 at the end, put in 50001, which is the port we need, and then append /v1, which specifies that we're using v1 of the API. For the API key, you can basically just put in a placeholder string. Everything else stays exactly the same.

So let's test it and see if it really works. I click play, and here we go: "Write Python code to output numbers 1 to 100." It wrote the file; the coder tells the user proxy to execute the Python code, the user proxy executes it, and there are the numbers 1 to 100. Perfect. It keeps going because I didn't actually configure the user proxy to terminate when it should, but it works. The point is this is now using Mistral 7B through RunPod to run your AutoGen agents, and it's completely open source: we're using text-generation-webui, an open-source model, and AutoGen. The only thing we have to pay for is RunPod, and technically you don't even need to: you can follow these exact steps on your local computer. The only difference is that instead of the RunPod URL, you'd use localhost, like http://localhost:50001, and that'll work locally. And if you want to know how to get text-generation-webui set up on your local machine, yes, of course I have a video for that, and I'll also drop that one down below.

So that's it, and I have a lot more AutoGen videos coming. I have an advanced tutorial that's going to cover a whole plethora of advanced topics and really get into the nitty-gritty of how AutoGen works. I have one where I'm collecting the best real-world use cases I've seen, because a lot of you have said, "You're showing us toy examples; show us real-world use cases that will actually build value." So I'm collecting those; feel free to drop a comment below if you have a use case you've found a lot of value in. I'm also going to continue working on integrating open-source models into AutoGen, and lastly, I'm building a personal project with it, so I'll probably share that as well; if you want to see that, let me know in the comments. I want to give a special thanks to Ivon Gabrielle in the RunPod Discord server, who showed me how to do this and actually offered a little bit of live help, so thank you very much. And if you want help yourself, feel free to jump into the AutoGen Discord or my own Discord; the link is in the description, and I hang out there all the time. I'm also starting to do some live office hours, so if you see me in there, jump in, ask questions, let's talk; I'd love to get to know you. If you liked this video, please consider giving it a like and subscribing, and I'll see you in the next one.
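The config-list change described in the video can be sketched as follows. This is a minimal sketch assuming the older AutoGen (pyautogen) config format with `api_type`/`api_base` keys; the pod hostname is a placeholder you'd replace with your own RunPod URL, and the API key is an arbitrary dummy string, since the local OpenAI-compatible endpoint doesn't check it.

```python
# Hypothetical AutoGen config list pointing at the text-generation-webui
# OpenAI-compatible API. The pod hostname below is a placeholder.
config_list = [
    {
        "api_type": "open_ai",
        # RunPod URL with the UI port 7860 swapped for the API port 50001,
        # plus the /v1 suffix:
        "api_base": "https://YOUR-POD-ID-50001.proxy.runpod.net/v1",
        # Dummy key -- the local endpoint ignores it:
        "api_key": "sk-111111111111111111111111111111111111111111111111",
    }
]

# Running text-generation-webui locally instead of on RunPod, the only
# change is the base URL:
local_config_list = [
    {
        "api_type": "open_ai",
        "api_base": "http://localhost:50001/v1",
        "api_key": "sk-111111111111111111111111111111111111111111111111",
    }
]
```

Everything else in the AutoGen application stays the same; the agents just talk to whichever endpoint the `api_base` points at.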
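The URL edit from the video (swap the UI port 7860 for the API port 50001 and append /v1) is mechanical enough to capture in a tiny helper. This is just an illustrative sketch; it assumes the port appears literally in the URL you copied from the browser, as it does in RunPod's proxy hostnames, and the example hostname is made up.

```python
def to_api_base(ui_url: str) -> str:
    """Turn a text-generation-webui UI URL (port 7860) into the
    OpenAI-compatible API base (port 50001, with a /v1 suffix)."""
    base = ui_url.rstrip("/")                 # drop any trailing slash
    base = base.replace("7860", "50001", 1)   # swap the port (first match only)
    return base + "/v1"

# Example with a made-up RunPod proxy hostname:
print(to_api_base("https://abc123-7860.proxy.runpod.net/"))
# -> https://abc123-50001.proxy.runpod.net/v1
```

The same helper covers the local case: `to_api_base("http://localhost:7860")` yields `http://localhost:50001/v1`.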
Info
Channel: Matthew Berman
Views: 56,774
Keywords: autogen, runpod, llm, ai, artificial intelligence, ai agents, microsoft autogen, textgen, textgen webui, llama, mistral, falcon, code llama
Id: FHXmiAvloUg
Length: 7min 2sec (422 seconds)
Published: Wed Oct 18 2023