Get Started with Mistral 7B Locally in 6 Minutes

Video Statistics and Information

Captions
All right, in this video I'm going to quickly show you Mistral AI's new model, but more importantly I'm going to show you how to get up and running with it locally. I'll show you how you can play around with it on Hugging Face, how to incorporate it into LangChain, and I'll point you to a resource where you can try out this model without downloading anything at all.

So Mistral AI just came onto the scene; they raised a ton of money earlier this year, and last week they released this Mistral 7B model. This model is particularly interesting because when you compare it to other models like Llama 2 70B and Llama 2 13B, you can see on the bottom line here that it performs better than even the 13B variant of Llama 2. I'm not going to go into all of the different benchmarks here; I'll leave the links in the description of the video if you want to dive into particular pieces of it, and I'll continue on to show you how to actually get set up.

What I'd encourage you to do is check out this new project called Ollama. It's the easiest way I've found to get a model running locally, and it's as simple as it looks: you download it, install it, run a simple command in your terminal, and then choose which models you want to download. You can check out the models built into their library here: there's Llama 2, there's Code Llama, and most recently they've added the Mistral model. So go ahead and download it. It's not available for Windows quite yet, but you can get it for Mac or Linux right now to get started.

Once you have it set up, go to the model you want to use, say Mistral, and it gives you the example of how to run it within your terminal. If you want to query the model directly from the terminal, you can run their instruct model or their text-completion model by simply typing ollama run mistral, and then you're right inside an interactive session and can chat with it much like ChatGPT.

The other option, which is great, is that out of the box, if you look in the top right-hand corner here, there's the Ollama icon. If you have Ollama running in the background, you can make requests to it directly; it's an inference server set up out of the box. What's nice is that as you install different models, say you have Mistral and you also have Llama 2 70B, you can query different models at different points. So if you have a local application and you want to query Llama for one thing and Mistral for something else, you can simply swap out the model name and, assuming you have them installed, prompt them on the fly like that.

I'll also show you that this is all on Hugging Face as well, so you can play around with the text generation right within their hosted inference API. Then I'll show you how to get started with LangChain. If you're a Python or Node.js developer, LangChain is probably what I'd encourage you to use for setting this up, because it makes it super simple and you're within an ecosystem with a whole lot of other tools you can leverage.
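To make the inference-server setup described above concrete, here is a minimal sketch of querying the local Ollama server with plain fetch and swapping models on the fly. It is not the exact code from the video; it assumes Ollama is running on its default port (11434), exposes the /api/generate endpoint, and that the named models have already been pulled.

    // Minimal sketch: query the local Ollama inference server with plain fetch.
    // Assumes Ollama is running on its default port 11434 and that the models
    // referenced here ("mistral", "llama2") have already been pulled locally.
    async function generate(model: string, prompt: string): Promise<string> {
      const res = await fetch("http://localhost:11434/api/generate", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ model, prompt, stream: false }),
      });
      const data = await res.json();
      return data.response; // the full completion when streaming is disabled
    }

    // Swapping models is just a matter of changing the model name.
    console.log(await generate("mistral", "Explain what an inference server is."));
    console.log(await generate("llama2", "Explain what an inference server is."));

You can run something like this with Bun or a recent Node.js release, both of which ship a global fetch; the filename and prompts here are just illustrative.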
Finally, the other resource I wanted to point you to is labs.perplexity.ai, where you can try out Mistral 7B Instruct, the Llama models, and some others. The nice thing with Perplexity Labs is that their implementations are often very fast. So if I say, "Can you demo this for me for a video? Write a few paragraphs about a llama," you can see that it's pretty quick in how it generates.

Last, I want to show you how to set this up yourself. I'll also spin up a quick GitHub repo with these small bits of code so you can reach for them if you're looking to implement this in some Node.js projects. Within my directory for this file I'm going to be showing you a LangChain demo and a fetch demo. For the LangChain demo I just have the core LangChain library installed, and from there, similar to the curl request, you make a request to that base URL and specify the model you want to use. If I demo it here — actually I'll use Bun so I don't need a compiler — I'll just run the LangChain demo. In this example we're waiting for the chunks to complete and then logging the result once it's done. But I'll also show you something some people might be interested in: if you want to work with each chunk as it comes back, you can get the streaming response as well.

The other thing to note is that I'm using an Intel-based Mac with 16 GB of RAM, not a new M1 or M2 Mac, so you can run this on systems that are a couple of years old. Just know that if you have a newer computer, these things will likely perform better.

Lastly, if you want to use this without any dependencies, here's a way you can do it with a simple fetch request in Node. If I run this with a similar bun command for the fetch demo, you can see I have it set up to log out the streaming responses, so it prints token by token.

And that's pretty much it. If you found this video useful, please like, comment, share, and subscribe, and consider becoming a Patreon subscriber as well. Otherwise, until the next one.
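For reference, the LangChain demo described in the captions boils down to something like the following. This is a minimal sketch, not the author's exact code: it assumes a late-2023 release of the langchain npm package that exposes the Ollama LLM wrapper at langchain/llms/ollama (newer releases moved it to @langchain/community), and that the local Ollama server is running with the mistral model already pulled.

    // Minimal LangChain sketch (import path assumed for late-2023 langchain releases).
    import { Ollama } from "langchain/llms/ollama";

    const llm = new Ollama({
      baseUrl: "http://localhost:11434", // the local Ollama inference server
      model: "mistral",                  // any model you have pulled locally
    });

    // Option 1: wait for the full completion, then log it.
    const full = await llm.call("Write a few paragraphs about a llama.");
    console.log(full);

    // Option 2: consume the response chunk by chunk as it streams in.
    const stream = await llm.stream("Write a few paragraphs about a llama.");
    for await (const chunk of stream) {
      process.stdout.write(chunk);
    }

Running it with Bun (e.g. bun langchain-demo.ts, the filename being whatever you saved it as) works without a compile step, since Bun executes TypeScript directly.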
Info
Channel: Developers Digest
Views: 26,482
Id: 5mmjig68d40
Length: 6min 42sec (402 seconds)
Published: Mon Oct 02 2023