Codestral Mamba: NEW Powerful Opensource Coding Model!

Captions

[Music] The Mistral AI team is back again with a newly released large language model called Codestral Mamba. This is a large language model that boasts over 7 billion parameters. It's a coding-focused model that's based on the Mamba architecture, and it's actually available under the Apache 2.0 license, meaning that you can utilize it for commercial use cases. Now, Codestral Mamba supports a 256k-token context window, which is much larger than that of the Mistral 7 billion parameter model, and it offers faster inference for larger-context tasks. Now, while smaller models like the 7 billion parameter model may not match the performance of larger ones, they offer faster inference speeds and lower compute costs. This is what Codestral Mamba 7B is going to be able to achieve, and it actually does a great job on the HumanEval benchmark, where it scored 75% compared to larger models like GPT-4 Omni, which scored 90%.

Now, Mistral AI didn't just release one model; they released another model called Mathstral 7B. It's actually the best-performing open-source math-based model, and you can see in this performance sheet that it is outperforming many of these other 7 billion parameter models. Now, in regards to the Codestral Mamba 7 billion parameter model, you can see that it is also achieving one of the best scores in comparison to these larger models as well as the models within its range. This is something that we're going to be exploring throughout today's video, so I definitely recommend that you stay tuned as I showcase the capabilities of the Codestral model, how you can get started, and so much more. So with that thought, guys, stay tuned and let's get straight into the video.

Now, before we get started, I'd like to introduce World of AI Solutions. This is a really big update that has been launched for my channel, and this is where I have compiled a team of software engineers, machine learning experts, and AI consultants. This is basically a team where we're going to be providing AI solutions for businesses as well as personal use cases, where AI solutions can be implemented to automate certain things or to help business operations. Now, if you're interested, take a look at the Typeform link in the description below. I definitely recommend that you take a look at the Patreon page so that you can access the new subscriptions that will be releasing this week. If you would like to book a consultation call with me, you can do so with the link in the description below as well.

So let's dive straight into this model. First things first, I'm going to showcase all the different methods that you can access this model from. Firstly, I'm going to mention La Plateforme, which is something that they introduced a couple of months ago, and it's a way for you to access all the different types of API keys that are associated with the models from Mistral AI. Now, under the Codestral tab, which you can see in the console once you have signed up for an account, you can request Codestral access, and this is where they're going to give you a preview of the model. You can also download it and then utilize it in various ways. You're going to need to verify your phone number, and once you verify your number, you're going to be able to access the API key for the Codestral model. They also have Le Chat, which is basically a chat interface, their chatbot, to access all their models. At this current moment the model is not listed, but within the next 24 hours you should be able to access this Codestral Mamba model within the chat by simply changing the model to the one that you want to work with.

Now, to install this model locally, you have multiple different methods, like Ollama or LM Studio, but I definitely recommend that you utilize LM Studio because you can install different quantized sizes of the Codestral Mamba model. Now, since this model was just released within the last couple of hours, it's going to be hard to find a model right away, but within the next 24 hours, like I said, you're going to be able to see all these different models being uploaded. So to install this locally, what I recommend doing is installing LM Studio. This is an easy way for you to run any open-source large language model locally, so definitely install this with this tutorial, which I'll leave a link to in the description below. Then open up LM Studio. Once you have it opened up, search for the Codestral Mamba model in the search tab over here. Right now you're only going to see one option, but later you're going to be able to download different quantized sizes. You can simply click on the download button, which will be over here, and then head over to the chat tab once you have installed it. Load up the model by clicking on this drop-down menu, and then you can start chatting with it within this interface that is fully local.

Let's now take a look at this model in further detail. Now, following the release of the Mixtral family, Codestral Mamba represents another step in their effort to explore and provide a new architecture. It's basically a new family, which you saw with the Codestral architecture that was recently released in, I would say, early summer or late spring. It is a new architecture that focuses more on the coding aspects, and it is actually available for free. You can modify and distribute it, and with it they aim to inspire new perspectives in architecture research. It was designed with the help of Albert Gu as well as Tri Dao. Now, this is where the Mamba model is going to differ from the Transformer models, by offering linear-time inference and the potential to model sequences of infinite length. It's going to make this more efficient for extensive user engagement as well as for quicker responses, which is why this model is going to be faster than all the other Mistral models. With its context length, it's going to be particularly beneficial for code productivity, and this is where this Codestral Mamba model was actually trained with advanced code and reasoning capabilities so that it can perform on par with the state-of-the-art Transformer-based models.

Let's take a dive into the different performance metrics associated with this model. This is a 7 billion parameter model that is outpacing all of these models, such as CodeGemma, CodeLlama 7B, as well as DeepSeek v1.5 7B. Now, obviously, in certain categories it is outperforming the CodeGemma model, but in the majority of these different benchmarks it is outpacing all of these smaller models. Now, if you take a look at a comparison with the Codestral 22 billion parameter model, it is obviously not outpacing it, but it is relatively close, and you can see that when it's compared to the CodeLlama 34 billion parameter model from Meta AI, it is actually doing a decent job in comparison to this larger model.

Something very cool to note is that Codestral Mamba was tested on in-context retrieval capabilities of up to 256k tokens. What this basically means is that it's going to be highly effective as a local code assistant, and like I said before, you can deploy Codestral Mamba with many different types of platforms. Now, they also stated that you can use the mistral-inference SDK, which is going to rely on the reference implementation from their GitHub repository. You can also deploy this using NVIDIA's TensorRT-LLM, and you can also utilize it for local inference, where you can keep an eye out for support in llama.cpp, which is actually currently available right away. This was something that they just released a couple of minutes ago. The raw weights can also be downloaded from Hugging Face.

Now, I'm going to be making another video which is going to be testing this model once it is more available, so definitely stay tuned for that. But that's basically it for today's video, guys. I hope you enjoyed it and got some sort of value out of it. This is a huge leap forward for coding-based models. This model will definitely be helping a lot of us out, especially as a local code assistant. This is where it's going to be performing the best, with its parameter size as well as its excellence in inference speeds. So with that thought, guys, make sure you take a look at this with the links in the description below. Make sure you follow me on Patreon to access the different subscriptions. Make sure you follow me on Twitter, a great way for you to stay up to date with the latest AI news that I'm constantly posting on a daily basis. And lastly, make sure you guys subscribe, turn on the notification bell, like this video, and check out our previous videos so you can stay up to date with the latest AI news. But with that thought, guys, have an amazing day, spread positivity, and I'll see you guys fairly shortly. Peace out, fellas.
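The captions describe Mamba-style models as offering linear-time inference, in contrast to Transformers. A minimal sketch of that idea follows, assuming a toy scalar state-space recurrence. This is not Mamba's actual layer, which uses vector states and input-dependent parameters; the function names and the constants `a`, `b`, `c` are hypothetical and chosen only for illustration. The point it shows is that a recurrent model carries a fixed-size state and does constant work per token, while causal attention compares each new token against every earlier one.

```python
def ssm_generate(tokens, a=0.9, b=0.1, c=1.0):
    """Toy recurrence (illustrative only): h_t = a*h_{t-1} + b*x_t, y_t = c*h_t."""
    h = 0.0                      # fixed-size state, independent of sequence length
    outputs = []
    for x in tokens:
        h = a * h + b * x        # constant work per token
        outputs.append(c * h)
    return outputs


def recurrent_steps(n_tokens):
    """State updates needed to process n tokens with the recurrence above."""
    return n_tokens              # grows linearly


def attention_steps(n_tokens):
    """Token-to-token comparisons for causal self-attention over n tokens."""
    # Token t attends to itself and all t-1 earlier tokens: 1 + 2 + ... + n.
    return n_tokens * (n_tokens + 1) // 2   # grows quadratically


if __name__ == "__main__":
    for n in (1_000, 256_000):
        print(n, recurrent_steps(n), attention_steps(n))
```

Running the script prints the two step counts side by side for a short sequence and for a 256k-token sequence, which is where the gap between linear and quadratic growth becomes large.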

Info

Channel: WorldofAI

Views: 808

Keywords: artificial intelligence, software development, coding llm, codestral, codestral mamba, powerful coding llm, ai coding, ai coding assistant, opensource coding model, deepseek coder v2, qwen2, text to application, text to frontend, ai code vscode, ai code completion, ai code assistant, stability ai, ai model, code completion, codellama, python, java, c++, sql, rust, coding, code generation, ai devs, replit code, CodeGeeX4, Code Completion, AI Coding, Code Interpretation, Mistral AI

Id: bDtHZamgavo

Length: 8min 45sec (525 seconds)

Published: Tue Jul 16 2024