The EASIEST way to run MULTIMODAL AI Locally! (Ollama ❤️ LLaVA)

Video Statistics and Information

Captions
This is actually happening completely on a local machine: the image is there, and the model has accurately described it. Thanks to the latest release of Ollama, which adds multimodal support, we can run the LLaVA model completely locally, and that lets us do multimodal image chat and pretty much whatever you would typically do with GPT-4 Vision. Maybe it is not as good as GPT-4 Vision, but it can do a good job if you want to build with local multimodal models. The good thing is it works both in the CLI and through the API.

First, let's get this enabled. If you already use Ollama, you will see a small icon that says an update is available; click "restart to update" and Ollama will update itself. If you're completely new, I'll link the repository in the YouTube description so you can install Ollama for yourself, and I also have a separate video on how to install Ollama, which I'll link as well. Once Ollama is installed, run `ollama run llava`, just like I've done here.

Currently there are two multimodal models available: LLaVA and BakLLaVA. The primary difference is that LLaVA has a Vicuna base while BakLLaVA has a Mistral 7B base, so you can try both and see which one you like. I've seen some impressive demos with BakLLaVA and I'm definitely looking forward to trying it, but for now, if you want to start with something easier, try LLaVA.

When you run `ollama run llava`, it will download the model. Once the model is available, all you have to do is drag and drop an image. Let me stop this session, say /bye, clear the terminal, and run `ollama run llava` again; that runs the LLaVA model on the Ollama endpoint, which we can also use when building in Python or any other code. Now I'll pick a file for us to use, this one that says "80%+ accuracy", drag and drop it into the terminal, and simply ask "What does it say?" with the image path added after the prompt. If you're doing this from code over an HTTP request, you would instead base64-encode the image and send it, but that's a separate topic altogether. If you simply want question-and-answer, say you have an invoice and want to understand what is in it, or you want to do OCR (optical character recognition), this is the easiest way. It takes a little bit of time; my machine is not an Apple Silicon machine. I might get one in a couple of weeks, and I'm really looking forward to it, but for now I've got an Intel machine. You can see that the model reports "80%+ acc" and does a pretty good job of explaining the image, and I only asked "What does it say?". If you ask more specific questions, like "describe this image", it does an even better job. For example, I'll post another image and ask a question: I grab the image, type "describe this image", paste the path, and it now tries to describe the image to me. I think it does a pretty good job in most cases. I'm a big fan of LLaVA.
I've tried LLaVA before; I'm not sure if I made a dedicated video about it, but I've been a fan given how good it is. I even tried LLaVA on one of the Google Gemini demos, and it came close. I'm not saying LLaVA is as good as Google Gemini Pro, but it came close, and I'm going to make a separate comparison between Gemini Pro and LLaVA, so if you want to know more about it, please subscribe to the channel and stay tuned.

For now, this is one of the easiest ways to run a multimodal model locally on your machine, and it takes just a couple of clicks: install Ollama and download the LLaVA model (like I said, `ollama run llava` will download it). Once the model is downloaded, drag and drop an image and it will give you a brilliant answer. For example, in this case, if I as a human being had to describe this image, I would say a humanoid is trying to act like a news anchor, with "live breaking news" text. Let's see what the model says: "The image features a robotic humanoid with blonde hair and blue eyes, and it appears to be a news reporter bot, as indicated by the presence of a microphone. The robot is standing in front of various screens displaying data and possibly news headlines. The scene seems to depict an informative setting where the humanoid could be presenting information to the audience." Honestly, this is a brilliant demonstration of what local multimodal models can do for you. Once again, I'm not telling you not to use GPT-4; I'm showing you alternatives for running multimodal models locally on your machine.

You can also use it as an API, and that's the advantage here. The same model can be used via the API: you've got a sample curl, and there is a separate chat API with which you can keep the conversation intact. So you can either use the normal generate endpoint or the chat API endpoint; all you have to do is send the image in the supported format, and you have a truly multimodal application running on your local machine, powered by LLaVA and supported by Ollama. I hope this short video is helpful to you in understanding how to run multimodal models locally. If you have any questions, let me know in the comment section. See you in another video, happy prompting!
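The generate and chat endpoints mentioned above both accept base64-encoded images. As a minimal sketch of what that request-building looks like in Python: the field names follow Ollama's REST API (`model`, `prompt`/`messages`, `images`, `stream`), port 11434 is Ollama's default, and the image path in the usage comment is a placeholder.

```python
import base64
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default local endpoint


def encode_image(image_path: str) -> str:
    """Read an image file and return its contents as a base64 string."""
    with open(image_path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")


def build_generate_payload(prompt: str, image_path: str, model: str = "llava") -> dict:
    """One-shot Q&A via /api/generate: a single prompt plus an 'images' list."""
    return {
        "model": model,
        "prompt": prompt,
        "images": [encode_image(image_path)],
        "stream": False,  # return one JSON object instead of a token stream
    }


def build_chat_payload(question: str, image_path: str, model: str = "llava") -> dict:
    """Multi-turn chat via /api/chat: images are attached to individual
    messages, so the conversation history stays intact across turns."""
    return {
        "model": model,
        "messages": [
            {"role": "user", "content": question, "images": [encode_image(image_path)]}
        ],
        "stream": False,
    }


def ask(payload: dict, endpoint: str) -> str:
    """POST a payload to /api/generate or /api/chat and return the raw reply."""
    req = urllib.request.Request(
        f"{OLLAMA_URL}{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")


# Usage (requires a running Ollama server with the llava model pulled;
# "invoice.png" is a placeholder path):
# print(ask(build_generate_payload("What does it say?", "invoice.png"), "/api/generate"))
```

To continue a conversation with the chat endpoint, you would append the model's reply and your next question to the `messages` list and POST again; that is what keeps the conversation intact.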
Info
Channel: 1littlecoder
Views: 3,729
Keywords: ai, machine learning, artificial intelligence
Id: smvSivZApdI
Length: 5min 53sec (353 seconds)
Published: Sun Dec 17 2023