How to Setup Mistral 7B and More Locally - Mistral Better than Llama 2 13B?

Video Statistics and Information

Captions
Hello everyone! Let's take a look at a new open-source model called Mistral AI. This is a real open-source project with an Apache 2.0 license, meaning it can actually be used without restrictions: commercial, personal, really whatever you want to do. It's a 7-billion-parameter model that claims comparable, if not better, performance than the Llama 2 13B or 34B models. The way they achieve this is through grouped-query attention for faster inference, and sliding-window attention for dealing with longer sequences at a smaller overhead. If that's actually true, the project is fascinating, because while quantization now lets us run smaller models like 7B and 13B on our local machines with consumer-grade hardware, the performance of these models still isn't quite there, especially when you compare them to, say, an open-source 70-billion-parameter model or closed models like GPT-4. With that said, let's try this model out on a Mac using Ollama and see if it shows any promise. Actually, the way they announced this model is kind of funny: they just tweeted a magnet link, with no information about the model at all. But anyway, let's try it out using Ollama.

All right, let's navigate to the Ollama website; I'll put the link in the description. Just go ahead and download and install it. It supports other operating systems as well, and I think it recently came out with better Linux support, but on Macs it follows the typical install process where you move the app into the Applications folder. Once that's done, launch the app and you'll see a little icon in the menu bar. There's no GUI for this app, so we have to interact with it using the CLI and some commands. For context, this app will let you run tons of different open-source models, quantized models as well, which includes Mistral 7B. So once you have it installed and launched, let's go back to the Ollama
website. In the top-right corner you'll see a page for models. As you can see, there are tons of other models in here for you to try out, but let's scroll down and find Mistral AI. On this page there's some more background information about the model and some instructions on how to use it with Ollama, but the tab we're interested in is the Tags tab, where we actually get the names of the model. Mistral comes in two variants, text and instruct; I'm going to pick the instruct one so we can do question-and-answer. If you're a little unsure which model to get, selecting the one actually called "latest" is probably not a bad idea, but if you're looking for a bit more context, I'll put a link to a video where I walk through how to understand the various segments of these model names. Anyway, I'm going to select this instruct Q4_K_M one, so simply copy the name of the model, and now we can switch over to the terminal and fire up Ollama. Let's start by making sure Ollama is installed correctly and is in our terminal path (okay, it is) and let's see the help command. Now let's paste in the command we got on the models page. This uses the run command, which will download the particular model if it's not already installed and put us right into a prompt so we can use it right away. Okay, now it's downloading, and I'll fast-forward this a little bit. As you can see, it put us right into a prompt where we can chat with the model right away, and I'm going to enter a prompt that generates a lot of tokens just to see how fast things are. Wow, as you can see, because this is a 7-billion-parameter model that's also quantized at 4 bits, it's actually really fast and, I would say, very usable for doing things locally. And this is in real time. I'm actually going to switch to a different Ollama UI for the next few questions. For the first question, I'm going to try this riddle; I think
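The terminal steps described above boil down to a couple of commands. This is a sketch: the exact model tag should be copied from the Tags page on the Ollama site, since `mistral:7b-instruct-q4_K_M` is the tag shown in the video and available tags may change over time.

```shell
# Confirm ollama is installed and on the PATH, and list its commands
ollama --help

# Download the model (if not already present) and drop into an interactive prompt
ollama run mistral:7b-instruct-q4_K_M
```

Running `ollama run` with a model that is already downloaded skips the download and opens the chat prompt immediately.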
almost all of the models I've seen, including GPT-4 and Claude 2, have gotten this wrong, so I'm not really expecting much here, but here it goes. Sally has three brothers, and each brother has two sisters; because Sally is one of the sisters, the brothers should only have one other sister. As you can see, I can't tell whether the model means two sisters including Sally or two in addition to Sally, so I'm going to say this is not right, though the reasoning seems logical.

For this next question I'm going to get a little more technical and ask it to write a crontab entry that runs every 10 minutes and saves the output to a log file. This looks good to me: yes, the crontab schedule is right and the details are good as well. Okay, a point for that question; for some reason a lot of the smaller open-source models got this wrong. For the next one, the answer we're looking for here is just "Spotify LTD"; you're basically parsing it from the string. Perfect, okay, amazing, another point for that. And for this other question, I'm just going to ask it to write a short email, something we could do on our local machine instead of going to ChatGPT: a quick email about why I won't be able to make the meeting next Tuesday. Let's see. Okay, the response is looking good; it tells the recipient why I can't make it and apologizes. At first glance this looks much better than Llama 2 at 7B. You might get different results, but so far so good, so please give this a try, and I'll see you in the next video. Thank you!
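For reference, the crontab entry the model was asked to produce would look roughly like this. The script path and log location are placeholders, since the video doesn't show the exact prompt or output:

```shell
# Run the (hypothetical) script every 10 minutes, appending
# both stdout and stderr to a log file
*/10 * * * * /path/to/script.sh >> /var/log/script.log 2>&1
```

The `*/10` in the minutes field means "every 10th minute", and `2>&1` redirects stderr into the same log as stdout; an entry like this is added with `crontab -e`.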
Info
Channel: AI Readme
Views: 9,135
Keywords: llm on your mac, LLM on m1 m2, ai readme, GPT-4, openai, llama2, machine learning tutorial, running llm locally, Mistral 7B, LLM tutorial, ollama, macos ollama, Mistral AI, mistral 7b mac
Id: bRbK145fH5g
Length: 7min 25sec (445 seconds)
Published: Mon Oct 02 2023