Running Mixtral on your machine with Ollama

Captions
Mixtral is the latest large language model released by Mistral AI. It's a mixture-of-experts model, which means that, at a high level, rather than having a single model answer every question, there are several: in Mixtral's case there are eight expert models, with a router sitting in front of them. When you ask a question, the router picks two of the experts to attempt an answer before sending back a response (there's a small sketch of this routing idea further below). If we look further down the blog post, the benchmarks put it at roughly the same level of performance as GPT-3.5. On Mistral's hosted platform, Mixtral is named mistral-small; the original Mistral, the OG Mistral if you like, is mistral-tiny; and there's an extra model called mistral-medium.

So we're going to try out Mixtral (or mistral-small), and we're going to run it with Ollama, which since version 0.1.16 has made it available locally as a quantized model that takes 48 GB of RAM. I think the original model would take well over 100 GB of memory to run, so you wouldn't be able to do that on a consumer laptop. My machine has 64 GB of RAM, which gets shared between the CPU and the GPU. We're going to ask Mixtral a few questions, compare it with the original Mistral 7B model, and look at the quality of the answers, whether the instructions were followed, and how long it takes.

If we call ollama list, we can see the models I've got downloaded onto my machine: there's mistral:latest, the original one, at about 4 GB, and a little further down mixtral:latest, the new one, at about 26 GB.

Let's start by having a look at a BBC article about the UK Supreme Court upholding a decision that a patent can't be held by an AI. There are a few details about the story, how the judges dismissed the bid to reverse the decision, and then a quote from someone involved. We'll ask the original model first: Mistral 7B, can you summarize this article in one bullet point? It comes back with what is actually a reasonably good summary, I would say, but it's not one bullet point: it's given us lots of bullet points, and it's probably half the length of the article itself. At the bottom we can also see how long it took, the prompt eval rate and the eval rate, and it was generating output pretty quickly.

Now let's do the same with Mixtral. This is a bit slower, but it has answered with only one bullet point, and I think it's done a pretty good job of summarizing. If you look at the stats that come back, the prompt eval rate and the eval rate are much lower than what we got for the other model, which I guess is what we expected, because it is a much bigger model.

Let's see what happens if we get it to do something that doesn't make sense. We'll delete the pasted article from the prompt, so we're basically asking it to summarize nothing. We ask Mistral first, and it actually says it's unable to summarize the article and asks us to provide one. We ask Mixtral the same thing, and this time it makes up some random article. This one's interesting, because sometimes when I run it Mistral is the one that makes something up, and sometimes it's Mixtral; in this case it was Mixtral, but I don't know whether we can treat this as a deterministic result.
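To make the commands concrete, here's roughly what this workflow looks like in a terminal. The listing is illustrative (IDs, exact sizes, and dates will differ), the article text is a placeholder, and the --verbose flag is what prints the prompt eval rate and eval rate stats mentioned above:

    $ ollama list
    NAME              ID     SIZE     MODIFIED
    mistral:latest    ...    4.1 GB   ...
    mixtral:latest    ...    26 GB    ...

    $ ollama run mixtral --verbose "Can you summarize this article in one bullet point? <paste article text here>"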
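And as a quick aside on the mixture-of-experts routing mentioned at the top, here's a minimal, illustrative Python sketch of top-2 routing. The router and experts are random stand-ins, not Mixtral's actual architecture or weights; it's only meant to show the shape of the idea, where just two of the eight experts run for a given input and their outputs are blended:

    import numpy as np

    rng = np.random.default_rng(0)
    NUM_EXPERTS, DIM = 8, 16

    # Stand-in "experts": each one is just a random linear map here.
    experts = [rng.normal(size=(DIM, DIM)) for _ in range(NUM_EXPERTS)]
    # Stand-in router: scores each expert for a given input vector.
    router = rng.normal(size=(DIM, NUM_EXPERTS))

    def moe_layer(x):
        scores = x @ router                # one score per expert
        top2 = np.argsort(scores)[-2:]     # keep the two best experts
        weights = np.exp(scores[top2])
        weights /= weights.sum()           # softmax over the chosen two
        # Only the two selected experts run; blend their outputs.
        return sum(w * (x @ experts[i]) for w, i in zip(weights, top2))

    token = rng.normal(size=DIM)
    print(moe_layer(token).shape)          # -> (16,)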
Let's have a look at another test. Here I've got a message written by my friend Gunnar Morling about getting his talk accepted to Kafka Summit next year; this was a LinkedIn message. We're going to ask Mistral: can you tell me the sentiment of this text, but only use the word positive or negative? Then we pass in the message. It comes back and says that, based on the given text, the sentiment is positive. Now, that is absolutely correct: Gunnar is obviously happy about getting accepted. But it hasn't followed the instruction completely, has it? It's added in a bunch of extra context. Let's do the same with Mixtral, and again we expect it to take a little longer, but this time it has come back with only the word positive, which is exactly what we wanted. Sometimes you can play around with the prompt a bit and force a model into doing what you want, but I wanted to see whether it would follow the instruction when given in the simplest possible way, so Mixtral probably wins this one.
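If you want to try this sentiment test yourself, the shape of the command is something like the following (the prompt wording is paraphrased from the video, and the message text is a placeholder):

    $ ollama run mixtral "What is the sentiment of the following text? Only use the word positive or negative in your answer. Text: <paste the LinkedIn message here>"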
Let's now have a look at another thing, something I usually get ChatGPT to do. I read a lot of romantic comedy books, and I sometimes take the blurb, or a little bit of it, from the Amazon page and paste it in; this is what this one looks like, so you can see what the book's about. Normally I'd ask ChatGPT: can you suggest prompts I can use to review the book, so I can remind myself what I've read? Let's see what Mistral comes up with. Going from the bottom: reflect on the themes of forgiveness, redemption, and following your dreams; going up, consider the novel's use of narrative elements; up a bit more, writing style and tone. It's not bad, actually. If we go all the way up, there's analyze the relationships and the setting. These are pretty good, I'd say, probably on par with what ChatGPT suggests. How about if we ask Mixtral to do the same thing? This one ends with "would you recommend it?" and "how do they address forgiveness?". Pretty similar stuff, I'd say, as we scroll up through the other answers; I wouldn't say there's a big difference between the two models on this one.

Let's now have a look at coding. I know neither of these models is specifically for coding, but let's see how they get on anyway. Here's a bit of code that I was using to process some data; I was using the Dask library to parallelize it, and I wanted to know whether I could limit the number of threads. I'm going to ask Mistral first. It suggests I should use the dask.bag API and then shows how I should tweak the code to do it. Looking all the way down, I think this would probably work, but it's quite a big change to the code, more than I want to make. What about Mixtral? This time it says to use the client, so it's asking me to create a Dask client, and it shows me how to do that, but I don't think it quite updates my code correctly. For comparison, this is the answer I got from ChatGPT for the same question, and it's much simpler: just call dask.config.set with the scheduler set to threads and set the number of workers. That actually does work; it does seem to control the number of workers (there's a sketch of that fix below).

Overall, I'd say Mixtral looks slightly better at following instructions, but predictably it takes longer to generate responses. If you'd like to learn more about the original Mistral model, check out this video up here.
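For reference, here's a minimal sketch of that Dask fix. The dask.config.set call is the part quoted from ChatGPT's answer; process_record and the sample data are made-up stand-ins for the actual code in the video, and the worker count of 4 is arbitrary:

    import dask
    import dask.bag as db

    # Run on the local threaded scheduler, capped at 4 worker threads.
    dask.config.set(scheduler="threads", num_workers=4)

    def process_record(record):
        # Stand-in for the real per-record processing from the video.
        return record * 2

    bag = db.from_sequence(range(1_000), npartitions=8)
    result = bag.map(process_record).compute()  # uses at most 4 threads
    print(result[:5])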
Info
Channel: Learn Data with Mark
Views: 4,467
Id: rfr4p0srlqs
Length: 6min 27sec (387 seconds)
Published: Fri Dec 22 2023