Run LLMs locally - 5 Must-Know Frameworks!

Video Statistics and Information

Captions
Do you want to learn how to run large language models locally? With local LLMs you have full control: your data stays private, there are no API costs, no internet connection is required, and not even a GPU is needed. Luckily for us, the awesome open-source community has created several free frameworks that make it super simple to get the latest and best LLMs running locally on your machine. So in this video I show you five must-know frameworks to run LLMs locally. With some of them you run the prompts in your terminal, others also come with a nice UI, and with some of them you can even connect the models to your own data. So let's have a look at all of them.

First, we have Ollama. It lets you run LLMs locally through your command line and is probably the easiest framework to get started with. Just use the installer or the command line to install it (Windows support is not yet available, but it's coming soon). Then you can type "ollama run" and the model name, for example "ollama run llama2", and it starts an interactive session where you can send prompts. It supports all the important models like Llama 2, Mistral, Vicuna, Falcon, and many more, and trying out a new model is as easy as running "ollama pull" and the model name. Another cool feature is that Ollama automatically serves a REST API on localhost; for example, you can send a POST request to run your model and get JSON responses back.

Next, there is GPT4All, a free-to-use, locally running, privacy-aware chatbot. It is kind of a ChatGPT clone that comes with a nice UI. It is also super easy to install: there are installers for every major operating system, so just download one, click install, and you should have the desktop client running on your machine. Now you can chat with LLMs locally. You can also install different models; most of them are instruction-fine-tuned models that are optimized for chat. You can also download embedding models and then upload your local documents that the model can then use to retrieve
information. Similar to GPT4All, there is also PrivateGPT, which comes with a nice UI to chat with LLMs. Its focus does not lie on trying out many different models, but rather on interacting with your own documents, 100% privately. To get it running you need at least Python 3.11; then clone the repo, install the dependencies, and run the module. You'll then see a nice Gradio front end where you can easily upload your files and query the documents.

Next, there is llama.cpp, a port of Facebook's LLaMA model in C/C++. This is probably the GOAT of all local LLM frameworks, and if I'm not mistaken, it was the first project that made it easy to run LLMs on a MacBook. Today it supports not only the first LLaMA model but also all other major LLMs. It is also worth mentioning that thanks to this project there is a new model format that is used in all the previously mentioned frameworks too, so this project enables the other frameworks. It is a bit more tricky to get this running, since you have to clone the repo and build it from source. You also have to obtain the model weights separately and run some scripts to convert them to the correct format. The easiest way I found to get started is to download already converted and quantized Llama 2 models from Hugging Face. Once you've managed to get it running, I promise you'll have a lot of fun with this project, and you'll be amazed how fast this C++ implementation is.

Last but not least, we have LangChain, still one of the hottest Python frameworks at the moment. LangChain is a framework for developing applications powered by language models and is not focused solely on running LLMs locally. You can learn more about it in another 5-minute explainer video on our channel. But among its many features, it also provides a whole guide about running LLMs locally, and you can import Ollama, llama.cpp, and GPT4All into LangChain to build more complex applications on top of them. This approach involves more coding, but it also offers the
most flexibility. I recommend trying out the standalone frameworks first and then switching to LangChain if you want to build more complex projects. All right, these are the five frameworks to run LLMs locally. Let me know in the comments which one is your favorite, or if you know any other good ones that I missed here. I hope to see you in the next video on our channel. Bye!
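The Ollama REST API mentioned in the transcript can be called from any language. Below is a minimal Python sketch, assuming Ollama is running on its default port 11434 and that the llama2 model has already been pulled; the helper function names are my own, not part of Ollama.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default endpoint

def build_payload(model, prompt, stream=False):
    """Build the JSON body for Ollama's /api/generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": stream}

def generate(model, prompt):
    """Send a prompt to a locally running Ollama server and return its text response."""
    data = json.dumps(build_payload(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["response"]

# Example (requires a running Ollama server):
# print(generate("llama2", "Why is the sky blue?"))
```

With stream set to False the server returns a single JSON object instead of a stream of chunks, which keeps the client code simple.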
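GPT4All and PrivateGPT both follow the same basic idea for working with your own documents: embed the documents, then retrieve the most relevant one for a query and hand it to the model. As a toy illustration of that retrieval step (not the actual code of either project), here is a bag-of-words cosine-similarity search in plain Python:

```python
import math
from collections import Counter

def embed(text):
    """Toy 'embedding': a bag-of-words term-frequency vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, documents):
    """Return the document most similar to the query."""
    q = embed(query)
    return max(documents, key=lambda d: cosine(q, embed(d)))

docs = [
    "Ollama runs models from the command line",
    "GPT4All ships a desktop chat client",
    "llama.cpp is a C++ implementation of LLaMA inference",
]
```

Real tools replace the word-count vectors with dense embeddings from a dedicated embedding model, but the retrieval logic is the same shape.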
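The "new model format" the transcript credits to llama.cpp is GGUF, and GGUF files begin with the 4-byte magic "GGUF". A quick sanity check on a downloaded model file could look like this (a minimal sketch, not part of llama.cpp itself):

```python
GGUF_MAGIC = b"GGUF"  # the first four bytes of every GGUF file

def looks_like_gguf(path):
    """Return True if the file at `path` starts with the GGUF magic bytes."""
    with open(path, "rb") as f:
        return f.read(4) == GGUF_MAGIC
```

This only checks the header, but it catches the common mistake of pointing a framework at an HTML error page or an older GGML-format download.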
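Part of LangChain's appeal here is that the different local backends (Ollama, llama.cpp, GPT4All) sit behind one common interface, so application code does not change when you swap backends. The sketch below illustrates that pattern in plain Python with a made-up EchoBackend; it is not LangChain's actual API.

```python
from typing import Protocol

class LocalLLM(Protocol):
    """Minimal interface every local backend must satisfy."""
    def generate(self, prompt: str) -> str: ...

class EchoBackend:
    """Stand-in backend for testing: it just echoes the prompt back."""
    def generate(self, prompt: str) -> str:
        return f"echo: {prompt}"

def summarize(llm: LocalLLM, text: str) -> str:
    """Application code written against the interface, not a concrete framework."""
    return llm.generate(f"Summarize in one sentence: {text}")
```

Any object with a matching generate method (for example, one wrapping the Ollama REST API) can be passed to summarize without changing it.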
Info
Channel: AssemblyAI
Views: 14,452
Id: 5WCvGyPpWwg
Length: 4min 30sec (270 seconds)
Published: Sat Nov 25 2023