Fine-Tuning Mistral AI 7B for FREEE!!! (Hint: AutoTrain)

Video Statistics and Information

Captions
Let's learn the easiest way to fine-tune the Mistral 7-billion-parameter model with Abhishek Thakur's AutoTrain, i.e. Hugging Face's AutoTrain. We're going to use the free GPU in a Google Colab notebook to fine-tune Mistral 7B.

First, install the required libraries: pandas, just to look at the training DataFrame, and AutoTrain Advanced. Once that's installed, run the AutoTrain setup to update all the required dependencies. Meanwhile, keep your Hugging Face token handy, because you'll need it to upload the model to your Hugging Face Model Hub. Go to your Hugging Face account, click Access Tokens, and copy the token; if you don't have one, create a new one. Then run the login command, which authenticates this Google Colab notebook with Hugging Face so you can upload the fine-tuned model to the Model Hub, where anybody can use it, or you can use it yourself later whenever you want. At this point the setup is done, so enter the Hugging Face token: paste the token, click Log in, and once you've logged in, your notebook is successfully authenticated.

The next thing you need is a training dataset, the dataset on which you want to do the fine-tuning. If you don't have one, there is a dataset created by Josh Bickett, who also helped put together this Google Colab notebook, so we're going to use Josh's dataset. It's a very small, simple dataset, which is why you won't see the model perform particularly well. If you look at the dataset, it has a couple of columns, but the main column for our training purposes is the text column: that's where you have the data in the human/assistant format. Print the text column and see how it looks; each row starts with "### Human:" followed by "### Assistant:". If you were to create your own custom dataset for fine-tuning, this is the format you need to produce, and the AutoTrain library is going to look specifically for that text column inside the training DataFrame.

Once all of this is ready, the only changes you need to make are these: give the project a name, whatever name you want it represented by, and then specify the final model name, i.e. the name under which you want to upload this model to the Hugging Face Model Hub. After that, the one other thing you might want to pay attention to is the LoRA target modules. By default you won't see the LoRA target modules defined here, but the Google Colab notebook I'm going to share in the YouTube description will have them; make sure you specify the right QLoRA target modules for effective fine-tuning. We're just doing this for hobby or demo purposes, so here it doesn't matter much.

Now start the process. It will first download the model and the tokenizer; we're using the sharded model in this particular case, which is the bigger model split into roughly 2 GB chunks. Because we have very little data, training won't take long, and you can see the training loss has not come down, which means the model is not going to do well. Still, I want to show you the full loop, including how to run inference.
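Before moving on, here's the setup condensed into Colab cells. This is a minimal sketch, assuming a late-2023 build of autotrain-advanced; the exact behavior of the setup step varies between releases.

```python
# Install pandas (to inspect the training DataFrame) and AutoTrain Advanced.
!pip install -q pandas autotrain-advanced

# Let AutoTrain update its own dependencies inside the Colab environment.
!autotrain setup

# Authenticate this notebook against the Hugging Face Hub so the
# fine-tuned model can be uploaded; paste your access token when prompted.
from huggingface_hub import notebook_login

notebook_login()
```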
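The training data itself only needs a text column in the "### Human: ... ### Assistant: ..." format described above. Here's a toy illustration; the row content is invented for the example, not Josh's actual dataset.

```python
import pandas as pd

# One "text" column holding the full conversation string is all that
# AutoTrain's SFT trainer looks for in the training DataFrame.
train_df = pd.DataFrame(
    {
        "text": [
            "### Human: generate a midjourney prompt for: a person walks in the rain"
            " ### Assistant: cinematic photo of a lone figure walking through"
            " neon-lit rain, reflections on wet asphalt, 35mm, --ar 16:9",
        ]
    }
)

# AutoTrain reads the training file from the data path as a CSV.
train_df.to_csv("train.csv", index=False)
print(train_df["text"][0])
```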
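And the training command itself might look like the sketch below. The project name, repo id, and hyperparameters are placeholders, the sharded checkpoint id is the community one used in notebooks from that period, and flag names such as `--target_modules` have changed across AutoTrain versions, so check `autotrain llm --help` against your install.

```python
# Fine-tune the sharded Mistral 7B checkpoint with a 4-bit LoRA (QLoRA)
# adapter and push the resulting adapter to the Hugging Face Model Hub.
!autotrain llm --train \
    --project_name mistral-7b-finetuned \
    --model bn22/Mistral-7B-Instruct-v0.1-sharded \
    --data_path . \
    --use_peft \
    --use_int4 \
    --learning_rate 2e-4 \
    --train_batch_size 4 \
    --num_train_epochs 3 \
    --trainer sft \
    --target_modules q_proj,v_proj \
    --push_to_hub \
    --repo_id your-username/mistral-7b-finetuned
```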
At this point the model has been completely fine-tuned using LoRA, and the LoRA adapter gets uploaded to the Hugging Face Model Hub; as you can see, 10 LFS files have been uploaded. You can go to the Hugging Face Model Hub and see the files inside your repository. Right now it is private, and you just have to make it public; after that, anybody can see it and start using it.

The next thing we're going to do is run inference on this model: we'll take the fine-tuned adapter we just created and use it with the Mistral AI base model to actually generate text. The model name is something you need to keep in mind. Once you have it, go to the second Google Colab notebook, which I'll also link in the YouTube description. If you have two sessions running, make sure you close the first session, then run everything. Install the Transformers library, the most recent version from Hugging Face's GitHub repository; that's the first thing you need to do. You need a GPU for this as well; I'm not going to run nvidia-smi to show you the GPU, but keep in mind that you need one. Click Runtime, then Run all, that's one option, or run it cell by cell. The first thing it does is install the Transformers library. After that, we install a bunch of libraries that help us load the adapter: starting with PEFT, then Accelerate for GPU memory management, bitsandbytes for 4-bit model loading, and safetensors because the model weights are stored as safetensors. PEFT is going to help us load the adapter into memory and also merge it into the base model if required.

After installing all of these libraries, Transformers, PEFT, Accelerate, and bitsandbytes, load the required imports: import torch, PeftModel from peft, and Transformers to load the model. At this point you download the sharded model once again; if you don't use the sharded model, your Google Colab memory will collapse. Then load the adapter, and after the adapter, load the tokenizer. Now the model has been loaded into memory; just remember, once again, that we are loading the sharded model, not the original model the Mistral AI team released.

All you have to do now is give the input text in that particular prompt format and run it. Look at the output: for "generate a midjourney prompt for: a person walks in the rain", it just writes something like a poem or an essay. It's not an appropriate response, but again, we don't have enough training data and the training loss did not come down, which is why the model isn't doing anything great. So go to the prompt, make a change, and see whether your model gives you anything useful; if it doesn't, you need to improve the training data, improve the LoRA layers, and also increase the training set size to get any sort of good result.
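Here is the inference side assembled into one sketch. The adapter repo id is a placeholder for the one you pushed, the 4-bit loading arguments follow the transformers/bitsandbytes API of that period, and it assumes transformers, peft, accelerate, bitsandbytes, and safetensors are already installed.

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Base model: the sharded checkpoint again, so Colab RAM survives loading.
base_model_id = "bn22/Mistral-7B-Instruct-v0.1-sharded"
# Placeholder: the LoRA adapter repo you pushed with AutoTrain.
adapter_id = "your-username/mistral-7b-finetuned"

# Load the base model in 4-bit so it fits on a free Colab GPU.
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    load_in_4bit=True,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Attach the fine-tuned LoRA adapter on top of the base weights.
model = PeftModel.from_pretrained(model, adapter_id)

tokenizer = AutoTokenizer.from_pretrained(base_model_id)

# Prompt in the same ### Human / ### Assistant format used for training.
prompt = ("### Human: generate a midjourney prompt for: "
          "a person walks in the rain ### Assistant:")
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```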
Overall, in this video tutorial you've learned how to fine-tune the Mistral AI model with QLoRA: we quantized the model with 4-bit quantization and fine-tuned it. Not just that, we also learned how to upload the fine-tuned model to the Hugging Face Model Hub. Then we learned how to do inference, which means loading the quantized, fine-tuned LoRA adapter, using it with the base model, and having it do the prediction, i.e. the text generation, for us. I hope this end-to-end tutorial was helpful to you. Please let me know in the comment section if you need anything else, and thanks once again to Josh Bickett and Abhishek Thakur for sharing the code and the library that we ultimately used in this video tutorial. See you in another video. Happy prompting!
Info
Channel: 1littlecoder
Views: 17,302
Keywords: ai, machine learning, artificial intelligence
Id: jnPZApwtE4I
Length: 6min 54sec (414 seconds)
Published: Fri Sep 29 2023