Run Your Own Local ChatGPT: Ollama WebUI

Video Statistics and Information

Captions
What is going on guys, welcome back. In this video today I want to show you a tool that basically allows you to run your own ChatGPT-style interface at home, locally, which works with local models as well as with the OpenAI API. So let us get right into it.

The tool that I'm talking about is called the Ollama WebUI. I recently made a video about Ollama itself; it's a command line tool that you can use to run large language models locally, things like Llama or Mixtral, for example. This is now a web interface on top of that. If you scroll down in the GitHub repository (you will find a link in the description down below), you can see what this actually looks like: it basically looks like ChatGPT, but you can use it with your local models, with Llama 2, with Mixtral, and you can also connect it to OpenAI with an OpenAI API key.

So what we're going to do here is go through the installation instructions. In my case I already have it on the system, so I'm not going to install it again, but the installation is actually quite simple: you basically just have to install Docker, run the Docker container, and have Ollama installed. You can go to ollama.ai and follow the instructions there; it is currently available for Mac and Linux, and for Windows you basically have to use the Windows Subsystem for Linux. So you get Docker onto your system (you download it from the Docker website if you're on Windows or Mac, or with the package manager if you're on Linux), then you install Ollama and run it locally. After that, all you have to do is run the docker run command from the repository, and once you run it, the Docker container will be running on your system. In my case I just have to start Docker Desktop and click on the container that I already have there once it's started.
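For reference, the setup described here boils down to a handful of commands. The exact command shown in the recording isn't visible in the transcript, so the install script URL, image name and port mapping below are a sketch based on the Ollama and Ollama WebUI documentation around the time of the video (the WebUI project has since been renamed Open WebUI); treat them as assumptions and check the linked repository for the current versions:

```shell
# 1. Install Ollama (this script is for Linux; on Mac there is an
#    installer, on Windows you go through WSL). URL as documented then:
curl -fsSL https://ollama.ai/install.sh | sh

# 2. Pull one or more models so the WebUI has something to serve:
ollama pull llama2        # the 7-billion-parameter default
ollama pull mistral

# 3. With Docker installed, start the WebUI container
#    (image name and ports from the README at the time; may have changed):
docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v ollama-webui:/app/backend/data \
  --name ollama-webui \
  --restart always \
  ghcr.io/ollama-webui/ollama-webui:main

# 4. Open http://localhost:3000 in your browser.
```

These commands talk to external services and the local Docker daemon, so run them on your own machine rather than copying blindly; the port mapping means the interface appears on localhost:3000 even though the app listens on 8080 inside the container.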
And then it's running locally, basically. Of course, you need to have the models installed. In my case, running ollama list shows that I have Llama 2 with 13 billion parameters, Llama 2 with 7 billion parameters, Llama 2 Uncensored, Mistral and Mixtral. Mixtral is going to be a little bit hard, because it is 26 GB, so it's not really realistic to run it here locally. What you can see now in my case is that the Ollama WebUI is already running; I can pause it, stop it, restart it, run it, but basically you just need to run that one Docker command and then you have Ollama with the web interface. Then you can just open localhost:3000, and this is basically your web interface. You can see it looks exactly like ChatGPT. You can choose the model here, so let's go with Llama 2 with 7 billion parameters, for example, and let's just say "hello". Now, this is going to take some time the first time you send a prompt, because it has to load the model; it's basically the same as doing ollama run llama2. Then we can say something like "What is Python?", and it says Python is a high-level, interpreted programming language. Okay, great, it gives me some information here. Let's just ask "When was it released?", and there you go: this is Llama 2 running here locally. Now I can start a new chat, again with Llama 2. I'm not sure whether I can set a different model as the default here, but I can definitely pull a model by just entering its name. Let's go again with Llama 2, because it's simple, and say something else, for example "What is YouTube?". Now we get a faster answer, because the model is already loaded. We can also ask it if it knows NeuralNine: "Do you know the channel NeuralNine?" Probably not... oh, actually it does know NeuralNine. Great, awesome. So this is
the local model. Now, what we can also do is go into the settings, to "External", and add the OpenAI key. In my case I should have it stored in the clipboard; I'm going to censor it here so that you cannot see it. So now I have entered my OpenAI key, which you can get from the platform: you just log in and go to "API keys". Since I provided my API key, I can now choose the GPT models, so for example I can go with GPT-4: "hello", and then "Tell me about the NeuralNine YouTube channel", and it gives me the response from the OpenAI API. Keep in mind, of course, that I'm paying for this now, because I'm sending a request to the OpenAI API with my key, and to get the response I pay per token, per input token and per output token. The part about the creator behind the channel is made up; I don't know where it gets that from. But you can see that this is a very convenient way to use your local models and, if needed, the more powerful OpenAI models. This is exactly what we talked about in a recent video, where I said that ChatGPT seems to be getting worse and worse; one of the solutions was to just use the API, and maybe you even get more for your money that way. How often do you use ChatGPT? If you use it all the time, maybe the subscription is better, but it could also be the case that just paying per token through the API is actually better for you. So this might be a good way to use ChatGPT and your local models, depending on what you need.

Maybe we can get some simple code here from Llama: "Provide tic-tac-toe game code in Python". Let's see if it's able to do that; I'm not sure if this is going to work, but it does something. It gets especially interesting if you can get Mixtral to run on your system. In my case I think this is going to crash; I mean, we can try: "Code a tic-tac-toe game in Python". I'm not sure, I think this will not work, and it will probably mess up my recording, if anything. Let's see... it doesn't even seem to allocate space, so I'm not sure this is going to work. But if you can get something like Mixtral to work on your system, if you have enough VRAM, enough RAM and enough CPU power, this is actually something that can be very powerful, and then you don't have to use OpenAI, you don't have to use GPT-4 for simple tasks. So yeah, I think this is all in all a very good solution to run your own LLMs and your own interface on your own system, if you just want to use the API instead of the subscription.

So that's it for today's video. I hope you enjoyed it and I hope you learned something. If so, let me know by hitting the like button and leaving a comment in the comment section down below. And of course, don't forget to subscribe to this channel and hit the notification bell to not miss a single future video, for free. Other than that, thank you very much for watching, see you in the next video, and bye!
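To make the "API versus subscription" comparison concrete, here is a rough back-of-the-envelope calculation you can run in the shell. The prices are assumptions for illustration (roughly OpenAI's published GPT-4 list prices around the time of the video, not current figures), and the monthly token counts are made up; plug in your own numbers:

```shell
#!/bin/sh
# Assumed prices in USD per 1,000 tokens (illustrative only):
INPUT_PRICE=0.03
OUTPUT_PRICE=0.06

# Hypothetical monthly usage:
INPUT_TOKENS=1000000    # tokens you send (prompts plus context)
OUTPUT_TOKENS=200000    # tokens the model generates

# cost = input_tokens/1000 * input_price + output_tokens/1000 * output_price
awk -v it="$INPUT_TOKENS" -v ot="$OUTPUT_TOKENS" \
    -v ip="$INPUT_PRICE" -v op="$OUTPUT_PRICE" 'BEGIN {
    cost = it / 1000 * ip + ot / 1000 * op
    printf "Estimated monthly API cost: $%.2f\n", cost
}'
# With these made-up numbers the API comes out at $42.00/month,
# versus a flat $20/month ChatGPT Plus subscription.
```

With lighter usage (say a few hundred thousand tokens a month) the same formula lands well under the subscription price, which is the point made in the video: whether pay-per-token beats the flat fee depends entirely on how much you actually use it.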
Info
Channel: NeuralNine
Views: 18,224
Keywords: ollama, llama, ollama webui, web interface, chatgpt, openai api, local chatgpt, mistral, mixtral
Id: d1kgnsO2yUs
Length: 8min 27sec (507 seconds)
Published: Wed Feb 14 2024