How to Run Your Private ChatGPT-Like Assistant Offline: Run AI Locally on Your Computer

Video Statistics and Information

Captions
Hi. In today's video I'm going to show you how you can run your own private, GPT-like personal assistant on your device, without any internet. To prove that this is possible, I have disconnected my internet, so there is no Wi-Fi, and I'm going to ask a question: if today is 31st December, what's tomorrow? Let's see if that works. Oh, I need to select the model first; I'm running Llama 3. Interesting: it says that if today is 31st December, tomorrow would be January 1st. We can ask follow-up questions, like "and what after that?", and it says then it'll be Jan 2. Great. So if you are interested in running your own private, GPT-like assistant on your device, with no dependency on the internet, this video is for you. In the next few minutes I will show you how to get this up and running on your own machine. To start, you need two things. The first is Ollama, a project that helps you download LLMs to your device and interact with them locally. To get started, you need to download Ollama; once you click on Download, there are options for different operating systems. In my case, I've already downloaded and started it. There are two ways to check: first, it appears in my menu bar on the Mac and says "Quit", which means it is running; second, I can go to the terminal and run `ollama --version` to see the installed version. This only installs Ollama, not an LLM, so how do you know which LLM to install? To find out, go back to the documentation and click on Models. In this video I'm going to look at Llama 3, which is my favorite and go-to model at this point in time. Click on Llama 3 and you will see a page dedicated to it. Llama 3 is an open-source model, released by Meta, the company behind Facebook, and the way to install it is by running the command shown on that page. Keep in mind that Llama 3 comes in two different sizes: one with 8 billion parameters and the second
one with 70 billion, and there is a huge difference between the sizes. The one with 8 billion parameters is only 4.7 GB, so it's smaller, faster at inference, and takes less time to download as well. What we are going to do next is download this model. For that, click here to copy the command, go back to the terminal, paste it, and hit Enter. Ollama at this point is pulling the manifest and downloading the model to our machine, and as you can see it's taking some time; depending on your internet speed this may take a while to complete, so I will pause to save time in the video and come back once it is installed. In my case the download is almost done; it will take a couple more seconds before the model is ready. All right, it took some time, but now I have the model up and running: it downloaded everything, opened it up, verified that the checksum is correct, and then started the model prompt right away, because our command was `ollama run`. So let's try this: what's 2 + 10? The model is thinking, and it says 2 + 10 is 12. To test that it works offline, I'll take the internet away and try again: what's 30 + 11? It says 41. And then: what is tomorrow's date if today is December 30th? See, it's working. So you have your personal assistant working offline for you, and you're not paying a dime for it. Now, interacting with your model using the command line can be cumbersome, because you may have different chats and you want to hold a conversation, so this is not the best experience. I will show you what you can do next. For that I will first turn my internet back on, because we need to access a few other frameworks as well. The next thing we need to do is install Open WebUI. Open WebUI is an open-source, MIT-licensed project that allows you to interact with LLMs in a fashion similar to ChatGPT, the product from the company called OpenAI that you may have used. In order to do that, we first need to
go to their documentation. If I come down here, there is a link to the Open WebUI documentation; I'll open it in a new tab, and it opens docs.openwebui.com. We'll go to the Getting Started section, and on the right-hand side there is a Quick Start with Docker. You can install it without Docker as well, but I always prefer to do it with Docker because of the portability: if I'm changing machines or going somewhere else, I can use the same commands and everything will just work. If you don't have Docker installed, you can go to docker.com, click Get Started, and download Docker Desktop for your operating system. I have that up and running as well; if I click here, I can see everything is working correctly. The next thing we need to do in order to get the ChatGPT-like experience is run this command, and the reason is that we already have Ollama installed directly on our computer. You can also run Ollama inside a Docker container, but what I found is that when I was interacting with Ollama inside a Docker container there was a lot of lag; it was not really using the GPU on my device, and I later found in the Docker documentation that GPU support in Docker Desktop is only available on Windows. That's one of the reasons I'm not running Ollama inside Docker and am using it directly on my machine: it is now able to leverage the GPU that I have. A small trick, but it makes a lot of difference when interacting with the model. Now I will take this command, go back to my terminal, press Ctrl-D to exit the model prompt, clear the screen, paste the command they have given me, and run it. All of a sudden it prints a long hash, which means the container is starting, and the way we can find out whether our container is running is by running this command,
which is `docker ps`. It shows me all the Docker processes, and I'm filtering by the name open-webui because I provided that name in the run command. If I hit Enter, it's up and running on port 3000: it runs on port 8080 inside the container but is exposed on port 3000 on my machine. To access it, I will go to Safari and open localhost:3000. The first thing you will see is that you are unauthorized and need to create an account. This is how the project currently works, but all of this authentication happens locally; none of your data goes outside. So I'll pause for a minute, enter the credentials that I already have, and then show you what it looks like once I sign in. All right, now I'm logged in; it has all these threads on the left, and I am signed in with my user account, but everything is running on my localhost. For clarity, I will close the sidebar so we can interact with the interface that I showed you at the start of the video. While I was waiting for the model to install, I also started installing the Llama 3 model with 70 billion parameters, and if I go back and run `ollama ls` (let me just clear the screen), I have two models here, so I can pick either model when interacting with Open WebUI. How does Open WebUI know how to connect to Ollama? That is already in the run command, in a part I didn't point out before: it sets a host URL that links the Open WebUI container to the Ollama instance running on our machine. As I mentioned, I've already installed these two models, so we'll go back to Open WebUI. The first thing you need to do here, unlike ChatGPT by OpenAI where there is just one model you interact with, is select a model. When I click on that, I have these two models listed, and they are exactly the same
that I have listed here. I will put them side by side so you can see: llama3:70b matches llama3:70b, and llama3:latest matches llama3:latest. As you can see, there is a link between Open WebUI and our Ollama models. And you're not restricted to Llama 3; you can install any model that is available on Ollama. Let me show you: right here on the Ollama website, if you go back to Models, there is a long list of models you can use, and any model you install using Ollama will be available for you to interact with in Open WebUI. I'm only showing you what you can do with Llama 3, but feel free to experiment more. I'll pick Llama 3 because the inference time is fast, and now I'm going to run the same kinds of questions I ran on the command line. I'll say: what's 3 + 13? And here we go, it says that's an easy one, it's 16. Now, does it also work offline? Let's take my machine off the internet and try a tricky one: what's 10 + 34 + 5? Let's see what it has to say. Oh nice, it's showing the calculation step by step, which is pretty fantastic, and the best thing is it's working completely offline. One final question for it: what's today if yesterday was January 1, 2025? Interesting, it didn't say anything at first, so let's save and submit again and see what it says. It says yesterday was that date, which means today would be Jan 2nd, 2025. Fantastic, so it calculated it correctly. It also means, which I didn't show you before, that you can click on Edit and change your prompt: I can say "what's today if tomorrow is..." and then save and submit, and it updates the response, saying today would be December 31st, 2024. Fantastic. So you can pretty much edit your prompt, save and submit, and it will update the responses here. Now, there are multiple things you can do, as you can see here, and one thing that I found very interesting
is that if you click on the info icon, it shows you the time taken during each of the steps, what the prompt token count is, and how long it took. That's pretty fascinating, and something I haven't seen on ChatGPT from OpenAI, so this is one difference that I really like. There are many more things we can explore, but I will create separate, smaller videos so that each is hyper-focused on one use case. I'm hoping this is useful; let me just turn my internet back on. I hope this was helpful, and if you have other questions or are interested in this kind of stuff, put a comment below on what you would like to see, and I will create those videos and make sure I post them and address your queries. All right, this is a great place to start experimenting with AI and LLM models locally without paying a dime, and Llama 3 is really a game changer: in my experience I've been using it a lot more than the free version of ChatGPT by OpenAI. Happy to share what I know, so put your thoughts down in the comments below and let me know what you find interesting and what you would like to see in the next video. Thank you, and I will see you again.
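The Ollama command-line steps walked through in the captions above can be summarized as a short shell session. This is a sketch, assuming Ollama is already installed and on your PATH; the model tags (`llama3`, `llama3:70b`) are as they appear on the Ollama models page and may change over time:

```shell
# Verify that Ollama is installed and the CLI is available
ollama --version

# Download the 8-billion-parameter Llama 3 model (about 4.7 GB)
ollama pull llama3

# Optionally pull the much larger 70-billion-parameter variant as well
ollama pull llama3:70b

# List the models installed locally
ollama ls

# Start an interactive prompt with the 8B model (exit with Ctrl-D or /bye)
ollama run llama3
```

Once a model is pulled, `ollama run` works entirely offline, which is what the disconnect-the-Wi-Fi test in the video demonstrates.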
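The Open WebUI setup described above boils down to one `docker run` invocation plus a check that the container is up. This is a sketch based on the Open WebUI Docker quick-start; check the current docs for the exact flags. The `host.docker.internal` host entry and the `OLLAMA_BASE_URL` variable are what link the container to the Ollama server running directly on the host (Ollama listens on port 11434 by default):

```shell
# Start Open WebUI, mapping container port 8080 to port 3000 on the host
# and pointing it at the Ollama instance running on the host machine
docker run -d \
  -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -e OLLAMA_BASE_URL=http://host.docker.internal:11434 \
  -v open-webui:/app/backend/data \
  --name open-webui \
  --restart always \
  ghcr.io/open-webui/open-webui:main

# Confirm the container is running and inspect the port mapping
docker ps --filter name=open-webui

# Then browse to http://localhost:3000 and create a local account
```

Running Ollama on the host rather than in a second container is the choice the video makes so that the model can use the machine's GPU; only the web UI lives in Docker.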
Info
Channel: bonsaiilabs
Views: 3,500
Keywords: AI Experiment, Offline AI, LLM Models, Personal Assistant, Running ChatGPT, Open Source AI, Model Installation, AI Development, Running AI Locally, GPT-3 Experiment, Llama3, ChatGPT
Id: zDEGSH1zj9Q
Length: 13min 27sec (807 seconds)
Published: Thu Apr 25 2024