Mistral 7B: The BEST Tiny Model EVER! Beats LLAMA 2 (Installation Tutorial)

Video Statistics and Information

Captions
Small but mighty, that's what he said, but on a real note: introducing the Mistral 7-billion-parameter model. On September 27th, 2023, Mistral AI released the most powerful language model of the 7B size to date. Mistral 7B excels when compared to models such as Llama 2's 13-billion-parameter model across a broad spectrum of benchmarks, showcasing its proficiency in understanding and generating natural language text. It even outperforms Llama 1's 34-billion-parameter model on many benchmarks, which we'll take a look at throughout today's video, despite having significantly fewer parameters, and that underscores its efficient parameter utilization.

Just take a look at the large language model rubric leaderboard. It shows that Mistral 7B's OpenOrca fine-tune as well as the Dolphin and vanilla models were able to outperform many other models across various benchmarks and categories. On writing a Python script, for example, it passed; it wasn't able to write a working snake game, but many of the other models couldn't accomplish that goal either. Across the other categories shown here it hit the mark in most cases, performing really well at this small model size on benchmarks where models with much larger parameter counts fail. It goes to show that they developed a tiny model that performs remarkably well for its size.

Throughout today's video we'll go more in depth into what this model is truly about, what you can do with it, and how you can install it locally on your desktop. With that thought, let's get right into the video. Stay tuned, and let's get to it.

Hey, what is up guys, welcome back to another YouTube video at World of AI. As mentioned at the start, we're taking a look at this new, powerful tiny model, Mistral 7B. We'll dig into the performance details, evaluations, and so much more, but before we get into the gist of that, we're going to look at how you can install it so we can play around with it as the video goes on.

If you haven't followed World of AI on the Patreon page, through which you can access our private Discord, I highly recommend you do so via the link in the description below: you get exclusive subscriptions to various AI tools, giveaways, the latest AI news, collaboration, consultation, and so much more. If you haven't followed World of AI on Twitter, I recommend that too so you can stay up to date with the latest AI news and trends. Lastly, make sure you subscribe, turn on the notification bell, like this video, and check out the previous videos on this channel, because there's a lot of content you'll benefit from. With that thought, let's get right back into the video.

Now, before we actually get to the installation, we're going to showcase a couple of methods for how you can utilize the Mistral 7B model.
There are a couple of ways to actually use it. Firstly, you can use it through the various chatbots that host this model: Poe, HuggingChat (which is fairly easy), and Perplexity. All of these chatbots host the instruct model, so it's easy to play around with even if you don't have the hardware to host it yourself, 7-billion-parameter model or not. That's a good option, but in this video we're going to showcase how to install the model locally on your desktop.

Now, I have made an installation video on text-generation-webui, which is what you'll need to host Mistral on your local desktop. If you don't have text-generation-webui, or you find it hard to install, I recommend watching that video, as I use Pinokio to help install it. What you do first is run the Pinokio installer: it's a one-click installer and it will fully set up text-generation-webui for you. Once you have Pinokio open, go to the text-generation-webui entry, click its download button, and it will start installing without requiring any prerequisites such as Git or Python, since it installs those for you. I'll leave the link in the description below. Once you've installed Pinokio, simply open it up and start the chat mode for text-generation-webui. It will take a couple of seconds to load, and once it has finished loading it will be running on localhost.

So now that I have text-generation-webui open (use whatever method you prefer; in my opinion Pinokio is the easiest way to host it), go to the model card page, which I'll link in the description below. This is the Mistral AI 7B Instruct model card, which is what most people use, but you can also use a quantized version based on the OpenOrca fine-tune, which TheBloke has published and which a lot of people found to be better. In my opinion, I'm just going to use the instruct one. Copy the model card title, go to the Model tab in text-generation-webui, paste it into the "Download custom model" field, and click Download. This will take approximately a couple of minutes depending on your hardware and download speed, so once this is done I'll be right back.

Now, once it has finished downloading you will see the "Done" line. Click the reload button and you'll see the Mistral 7B Instruct model available. Click Load, which will load the whole model, and you can then start playing around with it in the Chat tab; there's also the Default tab you can chat in. That's how you can install this model fairly easily. You can do the exact same thing with the GPTQ model, which is the quantized version: copy its model card title and paste it into the same "Download custom model" field, and click Download.
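Once the instruct model is downloaded, you can also drive it from a script instead of the chat UI. If you do, the model expects its prompt in the [INST] format described on the Hugging Face model card. Here's a minimal sketch of that format; the helper function is my own illustration, not part of any library, and exact whitespace handling can vary between tokenizer versions, so check the model card:

```python
def format_mistral_instruct(turns):
    """Build a Mistral-7B-Instruct prompt string.

    `turns` is a list of (user, assistant) pairs; the last pair's
    assistant reply may be None, meaning the model answers next.
    Format per the model card: <s>[INST] user [/INST] reply</s> ...
    """
    prompt = "<s>"
    for user, assistant in turns:
        prompt += f"[INST] {user} [/INST]"
        if assistant is not None:
            prompt += f" {assistant}</s>"
    return prompt

print(format_mistral_instruct([("What is 2+2?", None)]))
# <s>[INST] What is 2+2? [/INST]
```

In text-generation-webui you don't need to do this by hand; its instruction template for the model should apply the tags for you.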
It's as easy as that. You're also able to tweak the model, train it, and further improve upon it from the web UI.

Now, let's briefly go over some of the facts and performance details of the Mistral model. As mentioned at the start, this is a 7-billion-parameter model (7.3 billion parameters to be exact). It was able to outperform the Llama 2 13-billion-parameter model on all benchmarks, which is absolutely insane, and it outperforms the Llama 1 34-billion-parameter model on many others. In the realm of code-related tests, Mistral 7B nearly matches the performance of the CodeLlama 7B model on tasks such as code generation, debugging, and various other code-related components. This dual competence makes Mistral 7B a very versatile model for its size, which is unique and very useful for people who don't have the hardware to run larger models. This is why this tiny model packs a punch: because of how it's structured, how it utilizes its parameters, and how customizable it is at its small size.

Another cool thing I want to mention is that it uses grouped-query attention (GQA) for faster inference, which makes it more practical for real-time or large-scale applications. It also uses sliding window attention (SWA), which efficiently handles longer sequences without significantly increasing the computational cost. That makes such a tiny model much more efficient, and it's crucial for tasks that involve long code or text, or any prompt that requires long outputs. And one more cool thing: they're releasing the Mistral 7B pretrained model under the Apache 2.0 license.
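To make the sliding window attention just mentioned concrete, here's a toy sketch in plain Python (my own illustration, not Mistral's actual implementation) of the attention mask SWA implies: each token can only attend to the previous `window` positions, so the work per token stays bounded no matter how long the sequence grows. The paper uses a 4096-token window:

```python
def sliding_window_mask(seq_len, window):
    # mask[i][j] is True when query position i may attend to key j:
    # causal (j <= i) and within the last `window` positions (i - j < window).
    return [[j <= i and i - j < window for j in range(seq_len)]
            for i in range(seq_len)]

mask = sliding_window_mask(6, 3)
# Each row has at most `window` True entries regardless of seq_len,
# which is why compute and memory per token stay bounded.
print([row.count(True) for row in mask])  # [1, 2, 3, 3, 3, 3]
```

A full causal mask would instead give row counts [1, 2, 3, 4, 5, 6], growing linearly with sequence length; that difference is the whole trick.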
What the Apache 2.0 license means is unrestricted usage: you're able to download the model and deploy it in various environments, including your local setup as well as popular cloud platforms such as AWS, GCP, and Azure. It's currently available on Hugging Face, as we showcased earlier in the video, and it just shows that you can do so much with this powerful model, as it's free to use without restrictions.

Now let's go more in depth on the performance and further details of what this model can do. We talked about how it has been evaluated against different models such as Llama, and this model has shown exceptional capabilities across a wide range of benchmarks, something we saw on the leaderboard, where it performed very well across different categories. In all the metrics where Llama was used for comparison, Mistral was able to outperform it in essentially every category against both the 7-billion- and 13-billion-parameter models, and against the Llama 1 34-billion-parameter model it came out ahead on many benchmarks too. These benchmarks fall into different categories: common-sense reasoning, world knowledge, reading comprehension, math, and coding tests. Mistral 7B was able to excel in all of these categories, showcasing its ability to handle a wide array of language-related challenges, which demonstrates its effectiveness and comprehensive knowledge across domains. It has great versatility as a powerful language model, and it proves to be a strong contender we should keep an eye on as they release larger model sizes going forward.
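Circling back to the grouped-query attention mentioned earlier, here's a back-of-the-envelope illustration of why it speeds up inference: the key/value cache that grows with every generated token scales with the number of KV heads, and Mistral 7B uses 8 KV heads against 32 query heads (figures taken from the paper's config table, so treat them as my reading rather than gospel). A sketch of the arithmetic:

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_elem=2):
    # One K and one V tensor per layer, each [n_kv_heads, seq_len, head_dim],
    # stored in fp16 (2 bytes per element) by default.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem

# Mistral 7B per the paper: 32 layers, head_dim 128, 8 KV heads (GQA)
gqa = kv_cache_bytes(n_layers=32, n_kv_heads=8, head_dim=128, seq_len=4096)
mha = kv_cache_bytes(n_layers=32, n_kv_heads=32, head_dim=128, seq_len=4096)
print(gqa // 2**20, mha // 2**20, mha // gqa)  # 512 2048 4
```

So at a 4096-token context, GQA keeps the cache at roughly 512 MiB where full multi-head attention would need about 2 GiB: a 4x saving that translates directly into faster, cheaper decoding.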
Now, we can see from this graph that it does a great job across many different metrics compared to the Llama 2 13-billion-parameter model, which it beat in every category, as well as beating the Llama 2 7-billion-parameter model in many categories. The only cases where a Llama model was able to beat Mistral were the Llama 1 34-billion-parameter model on reasoning and on BBH, and it goes to show that Mistral is on par with Llama 1's 34B size, which is quite reputable for its performance. This small model matched or beat many larger models from bigger institutions with far more resources and development backing.

Over here we can see this chart, which compares reasoning, comprehension, and STEM reasoning (MMLU). In this case, the Mistral 7B model performed as well as a Llama 2 model that would need to be over three times its size to achieve the same level of performance, which is something they showcase and discuss in this paragraph. This basically means Mistral's 7B model saves a lot of memory while also allowing for faster processing, and we see that it hits the best scores in comparison with the CodeLlama and Llama 2 models, making it a more efficient and cost-effective choice for these tasks. In essence, it provides high performance without resource-intensive parameter counts, beating many of these models not by just one or two points but by a decent gap, which shows how much more efficiently it uses its memory while allowing faster processing.

Lastly, guys, I'm going to leave a link to the research paper in the description below, because it provides a lot of value and detail showcasing what this model is truly about and how they accomplished this. It gives a good understanding of how they developed the architecture, with the details they provide; you have the results they showcased in the main blog post, and you can also see a more thorough analysis of the instruction fine-tuning methods, with proper examples. So if you're interested, the link is in the description below so you can get a better idea.

That basically concludes today's video on the Mistral 7B model. It's very capable of becoming a reputable, easy-to-use language model, as it outperforms many larger models on different benchmarks, so definitely keep an eye out for this. I'll keep posting about it as I get more news, on the private Discord as well as the other channels where I post news. But that's basically it for today's video. I hope you enjoyed it; thank you guys so much for watching, I really appreciate your support. Have an amazing day, spread positivity, and make sure you check out the Patreon page, Twitter page, and the YouTube page if you haven't. Subscribe, turn on the notification bell, like this video, and check out our previous videos so you can stay up to date with the latest AI news. With that thought, I'll see you guys fairly shortly. Peace out, fellas.
Info
Channel: WorldofAI
Views: 6,260
Keywords: mistral ai, llama 2, mistral 7b, mistral, llama, code llama, chatgpt, chat gpt, llm, large language model, llm performance, llama 2 13b, hugging face, fine-tuning ai, mistral ai funding, microsoft autogen, next-gen llm, conversational agents, gpt-4, code generation, ai showdown, ai model, ai news, ai updates, ai revolution, artificial intelligence, Mistral 7B, Language Model, Code Generation, AI Advancements, Apache 2.0 License, Fine-Tuning, Llama 2 13B, CodeLlama 7B
Id: b8TREZTFvn0
Length: 17min 4sec (1024 seconds)
Published: Thu Oct 12 2023