Finetune LLM using lora | Step By Step Guide | peft | transformers | tinyllama

Captions
Hi, in this video I'm going to show you how you can fine-tune any large language model available on Hugging Face using the LoRA technique. I'll walk you through a step-by-step guide: loading the model, converting it into a PEFT model (parameter-efficient fine-tuning, in this case LoRA), preparing the dataset, and training the model on it. I'll share every result with you, the model's generations before fine-tuning and after fine-tuning, and you'll see that this technique really works well. You don't even need any powerful hardware: I'm going to run this on a Kaggle notebook, and Kaggle gives us enough compute to fine-tune models up to 7 or even 13 billion parameters if you use the 4-bit quantization technique. In this video I'll use 8-bit quantization, but I'll also show you how to load the model in 4-bit, which is just one different line of code. We'll go through everything related to it, so hold on and let's get into the video.

In the description below you'll find an .ipynb file, an interactive Python notebook. You can download that notebook, and I'll also provide a public URL to the Kaggle notebook so you can go there directly instead of creating a new notebook and importing it yourself. Either way, the code for this fine-tuning notebook is linked in the description.

Let's start. Click on the three dots and choose GPU T4 x2 as the hardware, then start the session. Once the session has started, we update the packages to the latest available versions, since Kaggle ships older versions that cause problems when running this efficiently. To run a cell you can press Shift+Enter (Shift+Return on Mac) or click the play icon. It takes a few minutes to install and upgrade all the packages. Once they are installed, click Run, then restart and clear cell outputs, and run the import cell again.

Here you can see I'm going to fine-tune the TinyLlama model, which is only a 1.1 billion parameter model; it's the intermediate checkpoint, not the recently released fine-tuned chat version, so I'm not fine-tuning that one. But you can fine-tune almost any model available on Hugging Face. Suppose I want to fine-tune the Mistral 7B v0.2 model: I search for it on Google, select the official model, copy its id, and paste it in place of this one. It will then load the Mistral 7B tokenizer and the Mistral 7B causal language model. For now I'll switch back to the TinyLlama model.

One more important thing here is the BitsAndBytesConfig. I'll jump over to VS Code, where I can zoom in better. This is where you define the precision you want to load the model in: I want to load the model in 8-bit, and I want the matrix multiplications and gradient computations to run in bfloat16.
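As a rough reference, a minimal sketch of this loading step might look like the following; the checkpoint id and the exact config values here are assumptions, not necessarily the notebook's exact code:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Assumed checkpoint id; swap in any causal LM repo id from the Hugging Face Hub
# (for example an official Mistral 7B id) to fine-tune a different model.
model_id = "TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T"

# 8-bit loading; for 4-bit, use load_in_4bit=True plus the bnb_4bit_* options instead.
bnb_config = BitsAndBytesConfig(
    load_in_8bit=True,
    # load_in_4bit=True,
    # bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # spread the layers across the available GPUs
)
```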
If you want to load the model in 4-bit instead, you change the load flag to 4-bit and change the compute dtype accordingly. For a detailed understanding of how this works, hold down the Ctrl or Cmd key over BitsAndBytesConfig to see all its parameters, or click through to the official source of BitsAndBytesConfig, which has detailed documentation of what each parameter does. As I said, if you want to load in 8-bit you set that flag to true, and if you want to load in 4-bit you set the other one to true.

Once the BitsAndBytesConfig is defined, we load the model with quantization_config set to that config and run the cell. It downloads the model, which is only about 4 GB, so it shouldn't take too long.

I'm going to fine-tune the model on the "awesome prompts" dataset, which is available on Hugging Face; I'll talk more about it later in this video. After this video I'll make another one on how to fine-tune a model on your own dataset. Here the dataset is already on Hugging Face, so we only need a little bit of preprocessing, but if you have your own dataset the steps are essentially the same; if you really understand this video, you shouldn't have any problem proceeding with your own data.

The goal is to fine-tune this model to generate a prompt from a given input title. For example, if I give the model the input "Linux terminal", I expect it to output the corresponding prompt. I'll talk more about that after preparing the model.

Now that the model is loaded, I prepare the instruction in a particular format: it has a "### system" section with the system description I want to give, namely "based on the input title, generate the prompt for the generating model", then the input title, and after that the model should start generating the prompt. Before training, I generate with this prompt and see what the model produces; after training, we use the exact same prompt on the fine-tuned model and compare. You can see that right now the model performs really badly; it's barely generating anything at all. After training you'll see the difference.
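To make that concrete, a sketch of this before-training generation step could look like this, reusing the `model` and `tokenizer` loaded above; the exact instruction template wording is an assumption based on the description:

```python
import torch

# Hypothetical instruction template; the notebook's exact wording may differ.
def build_prompt(title: str) -> str:
    return (
        "### system: Based on the given input title, generate the prompt "
        "for the generating model.\n"
        f"### input: {title}\n"
        "### prompt: "
    )

inputs = tokenizer(build_prompt("Linux terminal"), return_tensors="pt").to(model.device)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```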
Now we prepare the model for training. One more thing: we loaded the model in 8-bit (or you might have loaded it in 4-bit). If you load the model like that, you cannot fine-tune it directly; I read in a Hugging Face blog post that you cannot do full fine-tuning of a model loaded in 8-bit or 4-bit, you need the fully loaded model for that. PEFT, however, lets us fine-tune in 4-bit or 8-bit, and it also trains only a small subset of the weights. There are tons of videos explaining how LoRA works, so watch those if you want a deeper understanding. Here the setup is very simple: you import get_peft_model, LoraConfig, and the task type.

We enable gradient checkpointing on the model, prepare the model for training, and define the LoRA config. The r parameter is really important: the smaller it is, the smaller the adapter will be; the larger it is, the more fine-tunable parameters you get. If you look at the documentation or the actual implementation of LoraConfig, you'll see r described as the LoRA attention dimension, and there are other parameters you can define, such as alpha_pattern and lora_alpha. You can read about all of these, and if you understand how LoRA works you'll understand what these parameters mean.

Then we convert our normal Hugging Face model into a LoRA model using get_peft_model, passing the PEFT config we just defined, and after that we check how many parameters are trainable. You can see there are 1.1 billion parameters in total and only about 1 million trainable parameters, which is just 0.1% of the full model.
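A compact sketch of this PEFT/LoRA setup might look like the following; the r, lora_alpha, dropout, and target_modules values are illustrative assumptions, not the notebook's exact choices:

```python
from peft import LoraConfig, TaskType, get_peft_model, prepare_model_for_kbit_training

# Make the quantized model trainable: enable gradient checkpointing and
# prepare it for k-bit (8-bit / 4-bit) training.
model.gradient_checkpointing_enable()
model = prepare_model_for_kbit_training(model)

peft_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                                  # LoRA attention dimension: smaller -> smaller adapter
    lora_alpha=16,                        # scaling factor for the LoRA updates
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # assumed; depends on the model architecture
)

# Wrap the plain Hugging Face model into a LoRA (PEFT) model.
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()        # e.g. ~1M trainable out of ~1.1B total parameters
```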
Now the important part: preparing the dataset. I'd encourage you to pay extra attention to this part. If you want to fine-tune the model on your own dataset, just follow a different instruction format; once you understand how everything works, it will be very easy to fine-tune on your own data. This function here, we'll get back to it in a moment.

For loading the dataset, either from a Hugging Face repository or from your own local directory, you can use the datasets library and its load_dataset function. In the path you can pass either a local path or a Hugging Face repository id. For local loading you can pass JSON, CSV, or text files; the most commonly recommended format is JSONL, the JSON Lines file format. I'll talk more about that if I make another video on fine-tuning with a local dataset.

If we load the dataset like this, I'll show you what it looks like before and after preprocessing, one sample of each. First we define the format_dataset function, then run the cell. Initially each row has an "act" and a "prompt" key. Then we format it using this function. What does the function do? The dataset we just loaded is essentially a list of JSON records, and the map function applies format_dataset to every row: it passes each data point through the function and keeps whatever we return. So for one row of the dataset we build the formatted text: the system instruction, the input (the act), and the prompt itself.

Once we have built that text, we pass it through the tokenizer we defined in the first few cells. Once we have the tokens, we also need a labels column; the labels are what the model uses to compute the loss, i.e. to learn what the next token should be. So we copy the input ids into labels and return the tokens. Initially the dataset had act and prompt; afterwards it has input_ids, attention_mask, and labels, where input_ids are the tokens produced by the tokenizer. There is no point in staring at raw token ids, so I decode dataset[0]["input_ids"] with the tokenizer and you can see what it produced: the system text, the input "Linux terminal", and the prompt appended to it. I hope you now have a better picture of what the dataset looks like; if you use your own dataset, you can use a different format here. It all comes down to experience and trial and error.

Finally, we remove the act and prompt columns from the dataset, since we won't use them for training; you can call the remove_columns function on the dataset, and now only the tokenized fields remain.
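Putting the dataset steps together, a rough sketch could look like this; the dataset id, template wording, and max length are assumptions, and the tokenizer is the one loaded earlier:

```python
from datasets import load_dataset

# Assumed Hub id for the prompts dataset; a local file works too,
# e.g. load_dataset("json", data_files="my_data.jsonl").
dataset = load_dataset("fka/awesome-chatgpt-prompts", split="train")

def format_dataset(data_point):
    # Build the instruction text from one row ({"act": ..., "prompt": ...}).
    text = (
        "### system: Based on the given input title, generate the prompt "
        "for the generating model.\n"
        f"### input: {data_point['act']}\n"
        f"### prompt: {data_point['prompt']}"
    )
    tokens = tokenizer(text, truncation=True, max_length=512)
    # Copy the input ids into labels so the model learns to predict the next token.
    tokens["labels"] = tokens["input_ids"].copy()
    return tokens

dataset = dataset.map(format_dataset)
dataset = dataset.remove_columns(["act", "prompt"])  # keep input_ids / attention_mask / labels
print(tokenizer.decode(dataset[0]["input_ids"]))     # inspect one formatted sample
```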
Next we parallelize the model for training and define the trainer configuration. The model is the PEFT model we prepared, and the dataset is the one we just processed. Note that we only have 153 rows, so after splitting we'd have roughly 110 samples for training; if you have a larger number of samples you can use an evaluation dataset as well and uncomment those lines of code.

Let me show you how to split the dataset, it's very easy: you call dataset.train_test_split and pass, say, test_size=0.1 for 10% of the samples, and it returns two datasets, one for train and one for test. You can assign them to train_dataset and test_dataset and pass them in here. For evaluation we provide the test dataset, set the evaluation strategy and evaluation steps, and enable evaluation. Let's run the cell; we got an error, so I re-run the earlier cell, redefine this one, and then we can finally train.

One more parameter you can define is the number of steps you want to train for. Here I'm using 400 for this dataset; after some trial and error I found 400 to be a good value, but with a different dataset you can increase it gradually, first train for 100 steps, then 200, testing the model with the generation cell in between. You'll get a much better idea once you run the cell yourself. So I train the model for 400 steps and leave it running. Earlier there was a slight problem: the validation loss wasn't being reported, so I had to provide the label names; after a few minutes of searching on Google I found that solution. It had already trained for 150 steps when I stopped it to fix this, so in total it has trained for roughly 250 to 300 steps. I'll stop the training there and see how well it generates.

To show you, I copy the same generation cell and generate from the model again. Now it has produced "I want you to act as a Linux terminal, I'll provide you the commands to execute, you will not be able to ask questions or provide input." To show that it hasn't overfit on the dataset: the original entry was worded differently, and it generated its own version. Let's give it a different instruction, say "Math tutor". That looks good: "I want you to act as a math tutor, I'll provide you a question and you'll provide me with the answer, you'll also provide me with additional information to understand the question; my first question is what is the difference between a function and a program." You can train it for more steps, obviously; this was only about 250 steps in total.
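For reference, the split and trainer setup described above might look roughly like this sketch; the hyperparameters and output directory are illustrative assumptions:

```python
from transformers import DataCollatorForLanguageModeling, Trainer, TrainingArguments

# Llama-style tokenizers often ship without a pad token; the collator needs one.
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Hold out ~10% of the 153 rows for evaluation.
split = dataset.train_test_split(test_size=0.1)
train_dataset, eval_dataset = split["train"], split["test"]

training_args = TrainingArguments(
    output_dir="outputs",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    max_steps=400,                  # chosen by trial and error in the video
    logging_steps=25,
    evaluation_strategy="steps",
    eval_steps=50,
    label_names=["labels"],         # needed so the eval loss is reported for the PEFT model
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```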
Now let's look at how to save this PEFT LoRA model and download it so you can use it later. To save the LoRA model, you just call model.save_pretrained and pass the name you want to save it under; I'll call it "prompt_250_steps". If you run this cell and expand the output section, you'll see it. If you're running locally it simply gets saved on your local SSD or hard drive, but on Kaggle or Google Colab you have to look inside the output directory. There you'll see a new directory with the name we specified, and inside it two important files: the adapter model and the adapter config. Notice that the LoRA is only about 4 MB, while the original model was about 4 GB, so you can already see the advantage of training with LoRA: you could clearly train thousands of LoRAs for whatever tasks you want, maybe one for chatting, one for writing code, one for generating prompts, and apply whichever LoRA fits the task. I think you get the point.

To download it, you can either download both files individually, or use this cell: give a name for the working directory archive, say "prompt_250", and it will create a zip; execute it and you'll have a zip file shortly, which you can download directly by clicking on it. It's as simple as that.

Now I'll show you how to apply the LoRA on the original model, for example if you've cleared or killed the session on Kaggle, or you want to apply the LoRA later, after training. To demonstrate, I'll kill the session and restart the kernel; now if I try to generate, you'll see the tokenizer and the model are undefined. To apply the LoRA, you import the packages and define the model and the tokenizer exactly as we did earlier. Once that's defined, go down to the cell that loads the PEFT model: you import PeftModel from peft and call from_pretrained with the base model we just defined and the path to the LoRA, wherever you saved it; if it's on your local hard drive, just paste that path here. That's it, it's as simple as that. If I try to generate again, I get the same output as before.

While it's generating, let me show you how to merge the LoRA weights into the base model and then save it. Again, it's very easy: this is our PEFT model, and if you call merge_and_unload you get the model with the merged weights applied; then you call save_pretrained again with a path, and you'll get a directory for it. And here you can see it has generated the prompt: "I want you to act as a math tutor, you'll provide me with the question and I'll provide you with the solution", or something along those lines; train it for more epochs and you'll obviously get better results. Do note that if you merge the weights and try to save the merged model from a 4-bit model, it won't be saved; at the time of recording, Hugging Face cannot serialize a 4-bit model, so you need 8-bit or higher to save it like this.
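A sketch of these saving, reloading, and merging steps might look like the following, reusing the model_id and bnb_config from the loading cell; the adapter and output paths are illustrative:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM

# 1) Save just the LoRA adapter (a few MB: adapter_model + adapter_config files).
model.save_pretrained("prompt_250_steps")

# 2) Later, in a fresh session: reload the quantized base model (same model_id and
#    bnb_config as before), then attach the saved adapter to it.
base_model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)
peft_model = PeftModel.from_pretrained(base_model, "prompt_250_steps")

# 3) Optionally merge the LoRA weights into the base model and save the full model
#    (per the video, this works for 8-bit or higher, not for a 4-bit model).
merged = peft_model.merge_and_unload()
merged.save_pretrained("tinyllama-prompt-merged")
```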
I guess I'm going to end this video right here. If you have any doubts about training the LoRA, ask in the comment section below; I'll probably reply if I know the solution, or maybe someone else who knows it will. If you'd like a video on fine-tuning a LoRA on your own dataset, let me know in the comments and I'll try to make that too. Also, if you're not yet subscribed to this channel, what are you doing? A lot of videos like this are already on the way or already uploaded: there are tons of videos on Stable Diffusion, vector databases, and more, so check them out. That's it for this video, I'll see you in the next one; till then, goodbye. And don't forget to stop the session after you're done using it, otherwise you'll hit the limits set by Kaggle.
Info
Channel: ProgrammingHut
Views: 3,270
Keywords: peft, lora, finetuning, hf, transformers, llm, llama, tinyllama, how to, fine tune, finetune, step by step, how, to
Id: 1piV_X8LsOY
Length: 20min 19sec (1219 seconds)
Published: Sat Jan 06 2024