How to Run Powerful Local AI Models in Your Browser with Ollama, Llama3, & PageAssist

Video Statistics and Information

Captions
Hello and welcome to Generative Geek. In today's video I've got a very exciting tutorial lined up for you that's going to change the way you interact with AI models, right from your browser. We're going to use Ollama. If you don't know it, Ollama lets you run powerful open-source LLM models on your local laptop or machine, and you don't have to pay anyone to run them: they're open source, and Ollama makes it possible. It also lets you go out and build applications with these models through its integration with LangChain.

Here's the plan for today: we'll take Ollama and install Llama 3, the latest open-source model released by Meta and a very powerful one. It comes in two variants, 8 billion and 70 billion parameters, as you can see on the model page; I'm going to use the 8-billion-parameter model. With that in place we'll give it a UI using a Chrome extension called Page Assist. So as part of this tutorial we're going to install Ollama and use it to chat with Llama 3 from the terminal as well as from a browser, ending up with a very ChatGPT-like interface. Let me quickly show you: you get something like this, where I can type "how are you doing today?", where all the previous chat histories are kept, where you can select which model you want to talk to, and where you get the feel of ChatGPT without paying anyone for it.

Let's get started. The first thing we'll do is head over to ollama.com. Once you're there you'll see a download button; Ollama is available for macOS, Linux, and Windows, and the site will offer the file for the OS you're on, so download the binary for your specific OS. I'm on a Mac, so I'm downloading the macOS file. When I press the download button the download starts; it's roughly a 176 MB file, so it takes a little while. Once the download is finished, the next step is to install it: I open the file, click through, confirm that I want to open the app, and move it to Applications so it's always there. When Ollama runs it says "Welcome to Ollama"; click Next and install the command line. The command line is useful because it lets us pull and manage models from the terminal, which I'm going to talk about in a moment.
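For reference, here is roughly what that install step looks like from the command line. This is a sketch rather than something shown verbatim in the video: the Linux one-liner is the install script documented on ollama.com, while on macOS and Windows you simply run the downloaded installer, and the version check just confirms the CLI is on your PATH.

    # Linux: install via the official script (macOS/Windows: run the downloaded installer instead)
    curl -fsSL https://ollama.com/install.sh | sh

    # confirm the command line was installed and is reachable
    ollama --version

Once the app (or background service) is running, every other command in this tutorial talks to that local service.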
Now that the setup is done, before we go to the terminal let's also quickly see which models are available on Ollama. On ollama.com, click the Models link and browse through: there's Llama 3, which Meta recently released and which we're going to use today, but there's also Phi-3, which Microsoft recently released, plus Mistral, Gemma, Mixtral, and many more, and the process of installing any of them is really simple. Also, once Ollama is installed you'll see the Ollama icon show up in your Mac menu bar (on Windows it shows up in the taskbar). It basically means Ollama is running: if any application wants to use a model through Ollama, the background service is there to serve it.

Now let's go to the terminal. First just type ollama and see what comes back; that tells us the CLI is available. I'll clear the screen and run the real command, and I'm not making these commands up: if you go to the Llama 3 page on ollama.com, it describes the model as the most capable openly available LLM and also tells you exactly how to run it. So we type ollama run llama3, and it starts pulling the manifest. The default is the 8-billion-parameter model, a 4.7 GB file that can be installed locally and runs wonderfully. Let the manifest and the remaining smaller files get pulled down; if you're on a slower internet connection (not a slower computer), my suggestion is to keep the download running and carry on with your work, because it happens in the background.

Essentially, the command we typed, ollama run llama3, is saying: install Meta's Llama 3 model locally on my laptop and make it available. When you see "Send a message (/? for help)", the model is installed; type /? and it lists all the commands available inside the chat. I'll type /bye for now and come back, because I don't want the prompt to be so low on the screen that it isn't visible to you. If I run ollama run llama3 again, it doesn't install anything a second time, because the model is already available; it drops me straight into the chat. So I ask it: "Hey, I want your help with writing a single sentence which explains the beauty of nature in its wildest form." I just gave it a random task, whatever came to my mind, and it writes the sentence. And that is Llama 3 running on my local computer; I can put a UI on top of it, which is the next part of this video.

Let me also quickly show you a couple of other Ollama commands you can try. ollama list tells you which models are available locally on your system; on my system right now only Llama 3 is there, at roughly 4.7 GB. But I can go back to the models page and say: I have Llama 3, but I also want Gemma, and install it the same way.
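To keep the terminal steps above in one place, this is roughly the sequence of commands; the llama3 tag pulls the default 8-billion-parameter variant, and the slash commands are the ones /? lists inside the chat:

    ollama run llama3      # first run pulls the ~4.7 GB 8B model, then opens an interactive chat
    >>> /?                 # inside the chat: list the available slash commands
    >>> /bye               # leave the chat (the model stays installed)

    ollama list            # show which models are installed locally

Running ollama run llama3 a second time skips the download entirely and drops you straight back into the chat.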
If I go to the Gemma page I get the same thing: run Gemma with 2 billion parameters, or 7 billion if you want. Let's say I want the 2-billion-parameter one; the way to do it is similar. I come to the terminal and say ollama run gemma:2b. It starts pulling the manifest file, downloads it, and installs the model; it's about a 1.7 GB download. It initially estimates roughly 8 minutes, but it's a TCP connection, so the speed keeps increasing and it's going to be done in about a minute. Once you have these files installed you'll have various models available, and you can build whatever multi-model kind of application you like on top of them, all running from your local computer. You don't need to pay anyone for that: apart from the electricity bill and your Wi-Fi or internet charges, you don't have to pay anything to anyone.

Once the Gemma model is installed, the next step is the UI, but first let me also show you where these files are getting saved, in case you ever want to see where all these models actually go. I'm on the Ollama GitHub page, and the FAQ there has tons of questions that I believe you should go through anyway; in all my videos you'll find that I recommend reading the documentation, because most of these open-source projects have really good documentation and you can learn a lot just by going through it. For example: how do I know which models are loaded onto the GPU? Just run ollama ps. Let's quickly do that: my Gemma 2B is right now loaded in memory, and it will probably get flushed out of memory after a while if nothing is using it.

The FAQ also lists the default locations where models are stored. On a Mac my models are stored under ~/.ollama/models; Linux uses a different directory, and Windows has its own default directory as well. Let's quickly go to that models folder: I cd into it, do an ls -lt, and you can see there are manifests and there are blobs. If I go into blobs, I see the hash-named files for these models, and under manifests there's a registry. This is where Ollama actually saves all the models; that's what I wanted to show you quickly.
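Here is a compact version of the Gemma pull and the model-store check. The default paths are the ones listed in the Ollama FAQ; treat them as a guide, since they can differ between releases and setups:

    ollama run gemma:2b    # pull and run the 2-billion-parameter Gemma (~1.7 GB)
    ollama ps              # show which models are currently loaded in memory

    # default model locations (per the Ollama FAQ):
    #   macOS:   ~/.ollama/models
    #   Linux:   /usr/share/ollama/.ollama/models
    #   Windows: C:\Users\<username>\.ollama\models
    cd ~/.ollama/models
    ls -lt                 # shows the manifests/ and blobs/ directories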
Now let's install (not build) the Chrome plugin that will interact with Ollama. With Ollama installed and Llama 3 available on the laptop, let's give it a UI, because who wants to chat in the terminal when you can do it in a browser? One of the good things about ChatGPT, and the reason it became so popular, is not that it was available in a terminal but that it was available in a web browser, so anybody who's a non-coder could come and interact with it. The way we're going to set up our UI is with a Chrome extension called Page Assist. It's a wonderful plugin; I've been trying this extension for a while now, and it gives a web UI for all your local AI models.

The way you get started is to add it to Chrome, so I'm just going to add the extension. Once it's added, I open it from the extensions menu in the browser. Page Assist opens and says "Ollama is running"; if I quit Ollama it will presumably say Ollama is not running, and we'll try that out in a while. Here, if you select a model, whatever local models are available will start showing up. Before I do that, let me quickly run ollama ps once again: you'll see there is no model in memory right now (ollama ps shows you which models are loaded in memory, and nothing is loaded at the moment). I'll come back to Page Assist and select llama3:latest.

It also has a very interesting "search the internet" toggle. I'll come out of full screen and shorten the browser window a little, because sometimes on YouTube you can't see the bottom edge of the frame. You type your message, and if I want the internet searched I'll ask: "What happened during the latest India elections for the Lok Sabha?" It first says to set an embedding model on the settings RAG page, and we know an embedding model is needed for the retrieval step, so I'll pick llama3:latest there and save it. There are a lot of other settings too, like the chunk size: if you don't want 1,000 and want 2,000 instead, you can define all of that, but for now I'm keeping whatever is there as is and not adding anything extra. This is our UI; if you go to the endpoint URL, you'll see it says "Ollama is running", and that's the endpoint this plugin is actually making its calls to, with a keep-alive of five minutes, so the connection stays open for that long.

Let's come back and ask: "What happened during the latest Lok Sabha elections in India?" It's going to do the search, and let's see what it comes back with; I have no idea, I'm trying this out for the first time myself. It replies: according to my search, the latest took place on May 25; the BJP registered a clean sweep in Delhi and defeated the Congress. It's only talking about Delhi, not India as a whole, but whatever results it came back with, it did do a search and then summarized what it found and gave me the answer.

Next question: "Is NDA forming the government in India?" As a matter of fact, the NDA is indeed forming the government in India right now, and the answer says that the National Democratic Alliance has indeed formed the government in India, according to an article, with leaders agreeing unanimously. But this definitely doesn't look entirely right, because while all of that is correct, it could not have happened on May 26: the results themselves came out on June 4th. So you can see there are limitations as far as the search is concerned; it's not a Perplexity killer at the moment.
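Page Assist is a front end over the local Ollama HTTP API, which by default listens on port 11434; that is the endpoint behind the "Ollama is running" message. If you want to poke at it directly, here is a minimal sketch using the documented endpoints (the prompt is only an example):

    curl http://localhost:11434/            # plain GET: responds with "Ollama is running"
    curl http://localhost:11434/api/tags    # list the local models the UI can choose from

    # one-off generation against the local model, no browser involved
    curl http://localhost:11434/api/generate -d '{
      "model": "llama3",
      "prompt": "What happened during the latest Lok Sabha elections in India?",
      "stream": false
    }'

Note that a direct /api/generate call only uses what the model already knows; the internet-search answers in the demo come from Page Assist doing the web retrieval first and handing the results to the model.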
But let's ask it something else: what is sin 90? Llama 3 comes back with "sin 90 is one; in trigonometry, sine is defined..." and so on. Now let's open a new chat, pick Gemma 2B as the model, and ask the same thing: what is sin 90? Gemma 2B says sin 90 is zero. So you see the two models responding very differently, and it's up to you which one you want to pick: Llama 3 says sin 90 is one, which is correct, but this one says it's zero. Really, this tutorial is more a way to help you understand how you can have various models running locally and how you can actually chat with them (a one-line way to reproduce this comparison from the terminal is noted right after the transcript).

I can continue and ask "are you sure?", and the good part is that chat history is maintained. It answers "yes, I'm sure, the given information is consistent..." and starts talking about the National Democratic Alliance forming the government, so there is clearly some context loss here: history was being maintained, but the moment I asked "what is sin 90" it got confused; that's very clear. I ask "are you sure?" again and see what it comes back with: yes, sin 90 is indeed equal to zero. Okay, let's ask Google: what is sin 90? Google says it's close to 0.89, which is basically one, but I don't know why this model keeps saying zero; maybe that's the problem with the 2B model, it simply has too few parameters. If I ask Llama 3 "what is sin 90", it replies with one; Llama 3 is indeed a very powerful model, and we're running the 8-billion-parameter variant here.

So this is it, guys: go out, install new models, and see which one you want to play with. This tutorial was just to help you understand how you can do this. If you enjoyed it, if you liked something, if you learned something new, please do comment, like, share, and subscribe; it all helps my channel, and that's the least I ask of you. Thank you so much.
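As a footnote to the Llama 3 versus Gemma 2B comparison above: if you want to reproduce it without the browser, ollama run also accepts a one-shot prompt, answers once, and exits. A quick sketch, using the same model tags pulled earlier:

    ollama run llama3 "What is sin 90?"
    ollama run gemma:2b "What is sin 90?"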
Info
Channel: Generative Geek
Views: 182
Id: w-KO201kW4M
Length: 18min 8sec (1088 seconds)
Published: Sat Jun 08 2024