Use AutoGen with a free local open-source private LLM using LM Studio

Video Statistics and Information

Captions
We can use the local inference server functionality of LM Studio to use an open-source local LLM with AutoGen. This private, free setup can be a cost-efficient alternative to the OpenAI models, as AutoGen uses a lot of tokens. Switching to the open-source LLM is very easy, as we will see in this tutorial. So let's dive in to see how to set up AutoGen with a local LLM.

Navigate to lmstudio.ai to download LM Studio for your operating system. LM Studio is available for Mac and Windows and is in beta for Linux. Here we download the Windows version of LM Studio; the installation is straightforward. After installing LM Studio you get to the home screen, where you can find the new and noteworthy LLMs. If you are looking for a short introduction to LLMs, how they are trained, and what a parameter count like 7 billion parameters really means, I recommend watching the "Intro to Large Language Models" talk by Andrej Karpathy.

Regarding choosing LLMs: if you remember from our last video about PrivateGPT, there are some open-source local LLMs, like Llama-2-7B-Chat and Mistral-7B-Instruct, which are used and recommended by PrivateGPT; we can use them in LM Studio too. On the other hand, there is a leaderboard from Hugging Face where you can compare different models based on criteria like parameter size, type, or other specifications. New and better fine-tuned models are introduced regularly, so feel free to try some of the newer models to see which delivers the best result for your case.

Back to LM Studio: you can search for LLMs directly from inside LM Studio. You can search for uncensored LLMs, for LLMs with "function call" in the name, for "code", or for other keywords you like. For a start, the simplest way is to navigate to the Home tab and use one of the suggested models, like Mistral-7B-Instruct, which is, as you remember, recommended by PrivateGPT too. The download of the model takes a while depending on the size of the model, and you can check the status of the download progress. The size of the LLM file
depends on the parameter count and quantization of the model. As mentioned, you can switch to the Search tab, search for a keyword, and find a specific LLM like DeepSeek, which comes in different variations and quantizations. Some of the files will be recommended to you, and you can start the download from the list of available files too. You can change some filters, like the compatibility, to see even more LLMs. Keep an eye on the alert on the side to see if the model can run on your system and hardware; for example, if your system does not have enough RAM to run a model, you will get an alert.

After downloading one or more LLMs, you can easily chat with them on the Chat tab. Here, for example, we can ask "What is LLM quantization?", and we get our answer back from Mistral about the quantization of LLMs and how it affects the used memory. It's like chatting with ChatGPT, but it's free and private.

But to run AutoGen locally, the most essential part is the next tab, where we can start a local inference server. On this tab we get some information on how to access the API via curl or Python. Notice and remember the local path to the API, as you will need it later in our script to connect AutoGen to LM Studio. Besides defining the port, we can choose between the downloaded LLMs, and when we are happy with the settings, we can start the server. After starting the server, one side of the connection is ready, and we can go to the next step and implement the AutoGen part.

We create a new folder for our project, change to the new directory, and from inside the directory we start Visual Studio Code. As usual, we first create a virtual environment and activate it; make sure the name of the virtual environment appears before the prompt before installing the packages. As we only need one package, we can simply install pyautogen using pip. When the installation is done, we close the terminal and create app.py. In app.py we first import the autogen agents. Next, we have two config lists: one for the
paid and one for the free solution. The paid one uses OpenAI GPT-4-1106-preview and needs an OpenAI API key, which I will revoke before uploading the video. The config list for the free solution uses the path and the port defined in LM Studio for the API base and doesn't need any keys. In llm_config we assign one of the configs to config_list. First we use the paid config list to test the AutoGen solution with OpenAI.

The next step is to create the assistant agent; in our case it's the coder and has the name "assistant". Next, we set up the user proxy agent. The user proxy is the admin; it represents us and gives the assistant the instructions. We initiate the chat between the agents with the message "Write a code snippet to check if a string is a valid IP address." As we use the paid config, the agents use the OpenAI API in the background; in this process some tokens get used, and we pay to get the answer. We can use this solution when we want to use the latest models from OpenAI and it's worth paying for the answer. But there are situations where this is not necessary, and we can simply use a free, open-source, local, private model to get the answer, especially as AutoGen uses a lot of tokens and some conversations repeat themselves.

To switch to the free solution, we simply change the config list from paid to free. This time AutoGen uses the local free LLM, Mistral, from LM Studio, and as you can see, the LM Studio server starts logging the communication. Normally it takes more time to get the answer, but this depends highly on your hardware configuration. When the answer comes back, we can check it, compare it to the answer from OpenAI, and see how the local LLM performs. You can go to the LM Studio logs and get more information about the answer provided by the local LLM. If the local LLM is fine-tuned on our particular subject, it may even be better than the paid model. There are always cases where you need the best answer and are willing to pay for it, but there are still situations where the open-source solution is good enough or even better than expected.

Keep in mind there are many ways to run AutoGen locally; it depends on your device and the OS. LM Studio is a simpler solution for Windows and Mac, Ollama is another solution for Mac and Linux, which we will cover in upcoming videos, and if you check the AutoGen blog, it uses another solution with FastChat. Whatever solution you choose, it is important to know that you can use AutoGen locally for free with some open-source LLMs, and in some cases you may even get better results. Good luck running AutoGen for free!
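The app.py walkthrough above can be sketched roughly as follows. This is a minimal sketch, not the video's exact code: the model name, agent names, port 1234 (LM Studio's default), and the `"sk-..."` key placeholder are assumptions, and the config key names follow the pyautogen 0.2 style (older versions used `api_base` instead of `base_url`).

```python
# Sketch of app.py: two config lists, one paid (OpenAI) and one free
# (LM Studio's OpenAI-compatible local inference server).

# Paid config: GPT-4-1106-preview; "sk-..." is a placeholder, use your own key.
config_list_paid = [
    {"model": "gpt-4-1106-preview", "api_key": "sk-..."},
]

# Free config: points at the LM Studio server; no real key is needed.
config_list_free = [
    {
        "model": "local-model",                  # LM Studio serves whichever model is loaded
        "base_url": "http://localhost:1234/v1",  # default LM Studio server address/port
        "api_key": "not-needed",
    },
]

# Switch between paid and free by assigning the other list here.
llm_config = {"config_list": config_list_free}


def run_chat():
    # Imported here so the config above can be inspected without pyautogen installed.
    from autogen import AssistantAgent, UserProxyAgent

    # The assistant agent is our coder.
    assistant = AssistantAgent("assistant", llm_config=llm_config)

    # The user proxy is the admin: it represents us and relays instructions.
    user_proxy = UserProxyAgent(
        "user_proxy",
        human_input_mode="NEVER",
        code_execution_config={"work_dir": "coding", "use_docker": False},
    )

    # Initiate the chat between the agents with the task from the video.
    user_proxy.initiate_chat(
        assistant,
        message="Write a code snippet to check if a string is a valid IP address.",
    )

# run_chat()  # uncomment after starting the server on the LM Studio server tab
```

Switching from the paid to the free solution is then a one-line change: assign `config_list_paid` instead of `config_list_free` in `llm_config`.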
Info
Channel: business24_ai
Views: 3,688
Keywords: AutoGen, lm-studio, lm studio ai, lm studio api, autogen local, autogen local model
Id: 9WxJgI9jo4E
Length: 9min 1sec (541 seconds)
Published: Sun Dec 03 2023