AutoGen Studio with 100% Local LLMs (LM Studio)

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
in the last video I showed you how to install and set up autogen Studio on your local machine in that video we were using gp4 as a primary agent however one of the most common request was to replace this model with an open- Source large language model and that's exactly what we're going to be doing in this video this is going to be the easiest way with the minimal effort to use autogen with local llms first we need to serve a local llm through API endpoint for that we're going to be using LM Studio LM studio is one of the best option if you want to run uh local llms on your machine and it probably has one of the best UI out there but you can also serve your local llms through an API endpoint which makes it very appealing for this to be used with autogen Studio first you need to download LM Studio on your local machine and now you can install it on all different platforms including Mac OS Windows as well as Linux okay download the instance of LM Studio I have a dedicated video on LM Studio I'll put a link to that video I have an instance of LM Studio running if you're looking for a specific model search for the model and then click on download this will download the model for you LM studio is not open source but it's probably one of the best UI available that you can use on your local machine once you download the model of your choice all the models are going to be listed in here so for this example we are going to use the open Hermes 2.5 mral 7B model I have already downloaded the model so just click on this and this is going to load the model now we need to serve this model through an API endpoint and for that we're going to go and click on local server so here you can uh Define your own port number if you want to serve this API endpoint on a certain Port but I'm going to keep it to default of 1 2 3 4 you want to make sure that the port uh that you provide is unique enough that no other process on your computer is running that Port okay so we are all set to start server click on start server and now we are serving our model through an API endpoint and the API endpoint point is this so Local Host 1 2 3 4/ V1 the API endpoint format is exactly the same as the open AI API endpoint that means that this can be used as a replacement for open AI without any major changes to the code now once we have this up and running now let's go and start an instance of autogen studio in the previous video we looked at how to set up autogen on your local machine I'll put a link to that video in the description in order to start the autogen studio we are going to use this command so autogen Studio UI and then we want to serve it on Port 881 again you can use any port you wish just provide the port number here we're going to run this and now the instance of autogen studio is running at this location next you can create your own agents that will use the op source llm that we are serving through the LM Studio you actually don't need this for simple workflows because you can just create workflows directly and also create the llms instances that we want to use but in case if you have a multi-agent framework and you want to use this so I'll just walk you through the process so first click on new agent and let's call this sample assistant local provide the description of the assistant and in the description we're going to say this is a helpful assistant using open har llm we are not going to provide any system message but here you can add the local llm so let's remove the gp4 preview click on ADD now we need to provide a name so I'm going to call this local next we need to provide a base URL for this we need to go back to LM studio just copy this which is the a local host at Port 1 234 and then slash V1 and we'll just add that we don't need the API key in here click on add a model then click okay so now we have another agent that is using the local llm but let's create a completely new workflow so click on new workflow let's call it local workflow we're going to set the description to default this workflow has a sender and a receiver and here's the main thing where you want to replace or add local llm either you can choose the sample assistant from here but I will actually just add a local llm in here so I'll just call this local this is going to be the base We'll add the model click okay so that's our sender agent and then we have the receiver agent now for the receiver we again want to modify it so click on this by default it's using gp4 preview but we want to use a local one now you can call it local or whatever you want to name it but I'll just keep it local we'll provide the base URL here again now to this you can also add skills so for example by default we have three different skills generate images fetch profile find paper so let's add this fine paper and also probably add generate images now keep in mind we probably not going to be able to generate images because by default this was using gp4 and it was generating images through Dolly but since we replaced it with an open-source model which doesn't have the ability to create images so let's click okay but you can actually add a skill here and potentially you can use a stable diffusion model which you serve through automatic 1111 and use that as a new skill so that would be a way to create images using stable diffusion let's check our work flows so now we're going to create a new session in the playground to test our workflow so click on new here you can select your workflows so I'm going to select the local workflow that we just created the create okay so we can start experimenting with it so the first one I'm going to try is this sine W which is write a python script uh to plot a sine W and save it to disk as a PNG file right so here uh on the terminal you can see all the different messages that are being exchanged so it actually wrote python script in here and here on the API um back end you can actually see all the messages that are being exchanged and this is what the model is actually generating so here it actually generated a plot with a sine wave and also store that plot you can see all the messages that are being exchanged between the user proxy agent and the primary assistant agent and again keep in mind both of them are using the open source llm through the LM Studio that we were serving through the API endpoint so first messages write a python script to plot aign wave here's the response from primary agent or the primary assistant so it wrote uh python script to execute that for some reason there are multiple messages that are being exchanged between the proxy user for some reason there are multiple messages that are being exchanged between the user proxy agent and the primary assistant agent but overall this seems to work now let's try if the uh op Source LMS can use the skills that we assigned to these agents or not okay next I use this prompt find papers related to flash attention but here's the response that we got so to find academic papers related to flash attention you can use database like Google Scholar or pbat here is an example of how to search on Google Scholar now it walks me through a step-by-step process of how to do search on the Google Scholar but the model or the agent itself has a skill to look up papers on uh archives but for some reason it's not using that skill so we might have to look at another open source LM which has the ability to do function calling and use these skills this is one of the issues with the smaller open source llms that most of them are not created using tools or doing function calling on the um LM Studio side you can also see the total number of tokens both uh within the prompt as well as on the completion token site so this was a quick overview of how to run open source large language models with autogen Studio I hope you found this video useful I will be making a lot more content on autonomous agents both on auto agent and similar Frameworks like crew AI Dev chat I think this is going to be an exciting area that we will see a lot of progress during 2024 if you found this video useful consider liking the video and subscribe to the channel thanks for watching and as always see you in the next r
Info
Channel: Prompt Engineering
Views: 37,114
Rating: undefined out of 5
Keywords: prompt engineering, Prompt Engineer, AutoGen, Autogen studio, autogen studio, LLM, llms, gpt4
Id: ob45YmYD2KI
Channel Id: undefined
Length: 9min 46sec (586 seconds)
Published: Tue Jan 16 2024
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.