Run Open Source Multimodal Models Locally Using Ollama | CLI & WebUI

Video Statistics and Information

Video

Captions Word Cloud

Reddit Comments

Captions

hello guys welcome back in this video let's go through AMA Vision yesterday there was a release from AMA version 0.1.2 3 where they have made some improvements on how AMA handles the multimodel models I have already created two different videos before so if you are completely new to AMA please refer to these two videos in the first half of the video I will be showing you how you can do this in the CLI and the next one will be using the olama wave UI which is the separate project but is integrated with the amaama project right in both of these parts also I will be showing you two different examples one is the object detection use case where we will providing some image and we'll be asking what is in the image and the next one will be the text recognition we'll provide some image with text in it and it will provide us the what is written in that image let's get started okay so this is the blog post provided by AMA you can just go through this this here Vision models and new lava models right we'll be using the lava models and there is the high image resolution improv text recognition and reasoning capabilities and more permissive licenses right as it says here distributed via the opposite to license or the Llama 2 Community licenses and it is in three different flavors 7B 13B and 34b and as I said you before also I will be swing through the CLI but there is also the python and JavaScript if you want to go through that or I might create the in the future how you can uh how you can use the python package of olama right you need to make sure that you have Ama installed how to make sure that you can just go here and run AMA it must show something like this then you are good to go and the next thing is once it is installed you might need to update it if you have already installed I hope in your computer there is also already update button required that's what it it was for me otherwise if you are just downloading then it will be downloading the latest version so you can just go here and run V and enter it it should be 0.1.2 3 right I have already installed the lavva model just to not take too much of the time i r o Lama list it will show me the lava and mistal I'm going to use the lava but if you don't have you can just go here and run o Lama run lava and it will download the model for you I hope this is clear right the first one let's go with the object detection for that I have already downloaded the same images that is provided in the AMA documentation that is the jukar B jpeg let me just open that image this is the image that I am downloading here and I will be asking questions to this and the next one what I will be asking is the is the image or the text recognition and that will be using this image where this is the low logo of my YouTube channel where it is written data science Basics right so yeah let me just go here and first you need to copy the path right as it says here zarb JP I will first copy the path of that here in the clipboard this copies the path and I will run here o Lama run lava right it is now running for as I said you before also if you are running this for the first time you might need to download or it will download the model I will just go here and now ask okay describe let's say describe this image and then just paste the path of it and enter so it will now go through that image as you can see the image shows a person smiling and holding a VR headset above their head so this is how it shows that means that multimodal and the response time is quite fast right it depends by the way upon your machine but it does not need that that big of a hardware because as as the as the documentation also says there is this Lama 7B 13B and 34b and and if you go to the uh GitHub page let me go to the gethub just to show you because this is crucial because many of you get confused how much of the hardware requirement is needed as you can see there are different models and it is written here you should have at least 8 GB of RAM available to run the 7B models 16 GB to run the 13B models and 32GB to run the 33b models right make sure that you you have sufficient Ram to run these models okay this is how the object detection works right now how to do with the text recognition if you want to now go back off this terminal let's say in the normal terminal place you can just do buy and it will go out right now let me copy the path of the text recognition thing I will do the real path and not the zukerberg one but I will go with the data science uh Basics right I will copy this one and clear the screen I will run AMA run lava and by the way you can already provide here also for example I can say AMA Ron lava and then what I can do is okay what is in this image and I will paste the link and I will run enter so as you can see here you can do in two different things once once you run the AMA Ron lab or in the same line also it says that the image displays a logo that DS data science Basics this suggest that the content within this logo is related to the foundational concepts of data science data science something something right you get the idea that it is working in in in both ways and the good part of running with this one line is it's it provides the answer for us and then it goes back to the terminal again but if you run run lava then you can ask multiple questions based on that now this is I hope this is clear and this is the first part of the video the second part now let's do the same thing but in the wave UI for that what you need to do is you need to install this AMA wave UI as I said you before also if you are new please follow the second video that I created about how to install AMA web UI right I have already installed that and if I again go to this GitHub repo how I can make sure is first let's say that you need to have the AMA installed that's the easiest way to go how to make sure that you have Ama installed you can just go to this 12711 434 Port is for ama if I click this as you can see here it says AMA is running because I'm running AMA locally Right One requirement is fulfilled and the next one is now you need to you can install with Docker meaning that you need to have the docker installed in your system I have already installed docker if I do Docker here it will show me something here meaning that the docker is installed right and the next thing how you can open the open the AMA wave UI is running this AMA is on your computer use this Command right I can copy this I have already done this by the way just to show you here and if I go to the terminal and if I do control V it says that okay Docker run help it says error response from demon conflict the container name this is already in use by your container right because it's already in use so it is showing the error but now if I go to this Local Host 3000 so yeah this is the AMA wave UI now let's do the same thing that we did before in the terminal in the wave UI from here we can just go and choose the models we want I want to go with the lava latest 4.4 GB so this model is being selected and here I can upload the image I will go here and I will upload this jukar box image I will open this and ask the same question okay what do you see in this image with the help of wave UI it's quite easier right you can see here the response is quite fast actually in the image there is a man wearing a v these these things let me ask if it if it knows who is the man right I will say here who is the man right let me see if it if it knows that this is Mark zuker okay it says the man in the image is Facebook CEO Mark zuker so it also knows the image you know that this model is quite accurate here now let me do the text recognition also yeah I can just go here upload this data science Basics logo and I will just open this and I can ask here what is in this image in this image let's say like that and it must show us the data science Basics right let me see if it provides or not okay the image shows a person holding what uh okay it goes through the previous one also VR headset of this they have this this this and because it is now mixing between the previous and this current one but if I now let's say open this one I will create a new chart and ask in the new chart let me close this one I will go again here data science Basics I go here and I will say here what is in this image like this and then enter model not selected okay I haven't selected the model from here I will select the lava and then I will upload this so it it must okay the image features a circular emblem with text and small icon at the top okay it is Swing all the information about this particular image now you get the idea I I think when you upload two or three different images because it remembers maybe the previous one and it is giving the wrong answers but it's not giving the wrong answers but it is mixing between the two answers but now it's quite there so yeah I hope you you know now how to use the newest multimodal models with AMA from the CLI and also from the W UI just use the one that that best fits your use case or the best that you want to do if you are a terminal person you can go with the terminal one or if you want the wave UI then you can go with the way we thank you for watching and see you in the next video

Info

Channel: Data Science Basics

Views: 2,997

Rating: undefined out of 5

Keywords: mistral 7B, chatgpt, vercel sdk ai mistral, MISTRAL AI, what is mistral ai new model, how to run mistral ai model, mistral ai api usage, mistralai api with llamaindex, ollama, what is ollama, how to install ollama locally, ollama mistral, ollama llama2, ollama api, ollama custom model, ollama mixtral, ollama webui, chatgpt like ui ollama webui, ollama multimodal, ollama vision, ollama llava, how to run llava with ollama locally, run llava locally, llava multimodal, llava

Id: 1VBwYM6_xww

Channel Id: undefined

Length: 10min 27sec (627 seconds)

Published: Sat Feb 03 2024