AnythingLLM | How to get a Private Uncensored ChatGPT using Mistral-7B, LLama2, & more

Video Statistics and Information

Captions
Hey everybody, my name is Timothy Carambat, founder of Mintplex Labs and creator of AnythingLLM, among other things. What I'm going to show you today is what we've built that allows you to run a ChatGPT equivalent. It's not just a chatbot; it's literally equivalent to ChatGPT, and you can use whatever you want: Mistral, Llama 2, OpenAI, Azure, Anthropic. And when it comes to vector storage and retrieval, which was just added by default into ChatGPT, we've been supporting it, but we also allow private vector storage: you can use Pinecone, Chroma, Weaviate, Qdrant, whatever you want. You also get multi-user authentication right out of the box, so you can have multiple people using the same instance. It is ChatGPT equivalent plus a bunch of extra stuff, and in the future it'll do so much more. But I had to turn around and make this video today because we just added local LLM support, so now you can run any LLM you want inside of AnythingLLM. First, you can get a hosted version of AnythingLLM at useanything.com, but we deal with open source, and so we are open source: you can find us on GitHub, linked in the description. You can spin this up in Docker with pretty much a single command and be off to the races. The part we're going to focus on today is actually using a local LLM. I'm going to run Mistral because I want to run some uncensored prompts. The easiest way to do that, and what we currently support (we will be supporting more), is using an app called LM Studio.
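LM Studio's local server (shown later in the video) speaks the OpenAI chat-completions wire format, which is what lets AnythingLLM, or any other client, talk to it. As a rough sketch of what a request against such an endpoint looks like, assuming a server on localhost port 1234 and using only Python's standard library (the "local-model" name is a placeholder, since LM Studio simply serves whichever model you loaded):

```python
import json
import urllib.request

def build_chat_request(base_url: str, messages: list) -> urllib.request.Request:
    """Build an OpenAI-style /v1/chat/completions request for a local server."""
    url = base_url.rstrip("/") + "/v1/chat/completions"
    payload = {
        "model": "local-model",  # placeholder; LM Studio serves whatever model is loaded
        "messages": messages,
        "temperature": 0.7,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

if __name__ == "__main__":
    req = build_chat_request(
        "http://localhost:1234",
        [{"role": "user", "content": "hello, what model are you?"}],
    )
    # Actually sending this requires a running LM Studio server, so we only
    # print the prepared request target here rather than calling urlopen().
    print(req.full_url)
```

Because the format matches OpenAI's, any client library or tool that can point at a custom base URL can be aimed at this endpoint instead.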
This is an awesome app: if you have a Mac laptop with an M1 or M2 chip you can use it, and I run it on my Windows machine, so we're actually going to cut to how I do this on my Windows machine soon. Using this tool, every single model on Hugging Face becomes available to you for chatting, so whenever the latest and greatest model shows up on Twitter, it's already on Hugging Face, and you can use it in AnythingLLM to start chatting. As I said, today we're going to focus on Mistral AI's Mistral 7B, and the cool thing about Mistral is that stuff that would get you outright banned on GPT is totally fine for Mistral. So if you want to use it for stuff that is extremely personal or private that you don't want OpenAI to see, or maybe just stuff that might be unsavory in the eyes of OpenAI, you can use this kind of model plus our interface and pretty much get a ChatGPT of your own. Right here I have AnythingLLM running on localhost; I can run this in Docker, or you can use our cloud instance, and it all works exactly the same. When you load up this program for the first time, you're brought to an onboarding screen, and the very first question is: what LLM do you want to use? Before we finish this step, I'm going to get LM Studio running on my Windows machine, so join me over there. Now, I've already installed LM Studio, and when you open it up you'll get a screen that looks like this, with some recommended models. I actually went and already pre-installed the Mistral 7B Instruct model, and I downloaded a quantized version that is smaller but faster. That being said, if I wanted to install Code Llama, because I know that most of my chats will be coding related, I could do that here too. Pretty much every single model you can think of, like Vicuna, is available, and if you explore this UI some more you'll see that you can search
for more models. There is actually a chat inside of here, although it is quite limited: all you can do is send chats, there's really nothing more. And then you have this, which is where all the power of LM Studio comes from, and also how you can make it work with AnythingLLM: the local inference server. This comes pre-installed with LM Studio; all you need to do is pretty much click Start Server. What this does is make a localhost endpoint available that we can then send requests to, and it's in OpenAI format if you want to do a custom integration. So I'm going to click Start Server, and now we have a server running the Mistral 7B Instruct model at localhost port 1234. However, on my MacBook outside of this room, I need to be able to reach that. How do I do that? I actually have ngrok installed already, so for the purposes of this video I'm going to temporarily create an HTTP tunnel from this port to where I can reach it from an outside network. I don't recommend you do this in real life or long term, because it's not necessarily secure, but for the purposes of this video I just wanted to show you that this is possible; if I wanted to, I could run LM Studio on the same machine as AnythingLLM. So now I am tunneling, and if I visit this URL in a browser it shows "unexpected method GET"; that's okay, that's actually expected. What I'm going to do now is go back to my MacBook and put this URL into AnythingLLM. Okay, so now we are back on my MacBook running AnythingLLM, and I'm going to take that URL we got from my Windows machine (which is just how my setup is; you can run both of these things on the same machine, that's just because of my hardware limitations). We're going to select LM Studio, and for our base URL we're going to put in our URL, and then we're
going to add a /v1. The Mistral model's max token window size is 4,096 tokens, just like GPT-3.5, and we can now click Continue. The language model we chose only does text completion, so how do we do embeddings, which is how we do document retrieval, similarity search, and all of that? Right now we have OpenAI and Azure supported for embedding models; we are going to add open-source support here, it's coming soon (it's actually the last thing to do). So what I'm going to do is put my OpenAI API key in here and go to the next step. Now we get to choose our vector database. If you're going to be embedding files, you need a vector database, and we have every option you could need. The easiest one is actually LanceDB: zero setup, it runs on your AnythingLLM instance, and your data will never leave it, so everything happens on-instance. We're going to use that, for a fully private LLM, chatting, and embeddings. Here you can add a custom logo if you want to personalize your instance; we're going to skip that. Next we get to choose how we're going to use this. You can change these settings at any time, but I know it's just going to be me on this instance, so a simple single password is all I need. If you want a team or multiple users, where each user has their own thread in a workspace and everybody has their own account, you can pick Team. I'm going to do Just Me, and I don't need a password because we're running this locally. The next step is actually an alert that lets you know how your data is being handled, depending on what you selected: for our LLM selection we're using LM Studio, so the model and chats are only accessible on the server running LM Studio, and if you use a closed or isolated vector database like LanceDB, your text and vectors are also only accessible by this AnythingLLM instance. So we can continue. Now we need to create a workspace. A workspace is similar to a thread in ChatGPT. We're going to name this one
Mistral, because that's the name of the model we're using. This is what it looks like when you've already set up your instance and you come back to it. The first thing we're going to do is click on our workspace, and now we can just send a chat. There is no limit to prompt window size in AnythingLLM; we will always figure out how to get your prompt through, depending on the size of your LLM's context window, and all your information will make it to your LLM. So let's just send a chat: "hello what model are you". It responds that it is Mistral 7B v0.1, a large language model trained by Mistral AI, which is incredible, because that's 100% accurate. We can now send any chat; we can chat with Mistral as if it were any other LLM, or as if we were using ChatGPT, but all of our data is private, and since I haven't done any embeddings yet, I actually haven't even sent any information outside of the servers that I maintain to make this work. Now, to go over some other features: there is a chat mode and a query mode if you're working with documents, and there are settings where you can define, for example, the prompt. If you want a very specific system prompt, maybe you want it to respond as a pirate all the time, or you always want JSON format or CSVs, that's where you can put those kinds of instructions. You can also set the name, the temperature, and some other configurable settings that just make your life a little bit easier. Now, I have already gone and uploaded two text documents; one is a very short one called anything_llm.txt, and it's just a short description of what AnythingLLM is. So let's try a RAG (retrieval augmented generation) run using a custom document on Mistral: I'm just going to click my file, move it over, and click Save and Embed. Now, this didn't cost me anything, because I've actually already embedded this file before, because
it had this cache key, so you don't have to pay again. That's nice, because you can then embed the same file in multiple workspaces and only pay once. Now let's send Mistral a chat that it probably doesn't know anything about, and it'll use our document to get information and form a better answer. So let's ask it "what is anything llm", and it responds: AnythingLLM is the best way to take tons of notes, PDFs, and other source materials and turn them into a chatbot without privacy concerns. It says that we even do citations, and we do: you can see the document, click on it, and even see the exact chunk of text that was given to the model to help it form its answer. Now, I do want to be clear that we don't have streaming mode enabled yet (it is coming), so responses, especially if you're using a custom LLM or running it on a server elsewhere, can sometimes take longer than you would expect, because the full answer has to be calculated before we can send it back to you. But that problem will probably be fixed by the time you install AnythingLLM. I did just want to take a moment to show what an output looks like coming from LM Studio. You can actually see I passed in a message, and at this point you can see the history: it says "hello what model are you", and it said "I am Mistral", and then I said "what is anything llm". You can actually see every single step the LLM takes to formulate your answer. When streaming is enabled you won't even need to look at this, but I did just want to show you, because I do think it's an incredible thing to watch in the server log as it happens. Anyway, I'm going to turn off the server now, and we can go back to the video. Also, just to be clear, there's more to do within AnythingLLM: we also offer a fully featured API so that you can do custom integrations, so that AnythingLLM can just be the middleware of whatever it is you're building, and this comes complete with full developer documentation. Hopefully this gives you a small
glimpse into the crazy amount of power that AnythingLLM gives you to use as a backend or middleware for whatever it is you're building: a totally private chatbot talking to a custom model that maybe you have even built yourself, with full control over a ChatGPT experience without literally any privacy concerns at all. Every month AnythingLLM gets better and better, and it's because of the contributors in our open source project and our core team that make this all possible. If you need to reach out to us to talk about how you can use AnythingLLM, or you just have questions about how it all works in general, you can visit the Discord down here or contact Mintplex Labs; you can also comment on this video and I will respond to you, and we will get in touch and talk about your needs. Either way, this tool is open source, it's built for you, and we want to see what you build with it. Thank you.
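To make the "AnythingLLM as middleware" idea concrete, here is a minimal sketch of calling a workspace over an instance's developer API. The endpoint path, auth scheme, and payload shape below are assumptions for illustration only; check your instance's built-in developer documentation for the real routes and request bodies.

```python
import json
import urllib.request

def build_workspace_chat_request(instance_url: str, workspace_slug: str,
                                 api_key: str, message: str) -> urllib.request.Request:
    """Prepare a chat request against a workspace on an AnythingLLM instance.

    NOTE: the /api/v1/workspace/<slug>/chat route, Bearer auth, and the
    {"message", "mode"} payload are hypothetical stand-ins for this sketch.
    """
    url = f"{instance_url.rstrip('/')}/api/v1/workspace/{workspace_slug}/chat"
    payload = {"message": message, "mode": "chat"}  # "query" mode would restrict answers to embedded documents
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

if __name__ == "__main__":
    req = build_workspace_chat_request(
        "http://localhost:3001", "mistral", "MY-API-KEY", "What is AnythingLLM?"
    )
    # Sending requires a live instance and a real API key, so just show the target.
    print(req.full_url)
```

With a request like this, any application you build can treat the AnythingLLM instance, and whatever local model sits behind it, as a drop-in chat backend.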
Info
Channel: Tim Carambat
Views: 3,222
Id: nj0xdhTFV_8
Length: 13min 19sec (799 seconds)
Published: Sat Nov 11 2023