Fine Tune LLaMA 2 In FIVE MINUTES! - "Perform 10x Better For My Use Case"

Video Statistics and Information

Captions
Today I'm going to show you how to fine-tune Llama 2. This has been one of the most requested videos from you guys, and I'm so excited to show you how to do it. Fine-tuning can be a little intimidating, but in this video I'm going to walk you through it step by step, and it's going to be simple. Fine-tuning a model basically means giving additional information to the model so you can train it on use cases it didn't know about, give it more information about your business, or have it reply in certain tones. This video is brought to you by Gradient, and Gradient is the platform I'm going to be using today to do the fine-tuning. They offer ten dollars in free credits, and they let you do not only fine-tuning but inference as well, so if you want to build artificial intelligence models into your application, this is a super easy way to do it. Easiest of all, we're going to be using Google Colab for the fine-tuning, so you don't need to write any code at all; I've done it for you. With that, let me show you how to do it. Let's go. This is Gradient's homepage. The first thing you're going to do is click the sign-up button and sign up for a new account. Once you log in, you'll be greeted with this interface. You won't have these existing workspaces here; go ahead and click "Create new workspace", give it any name you want, and hit submit. I'm going to be using my existing YT Testing workspace, but you use whatever you created. To do the fine-tuning you're going to need two things: the workspace ID, which you can find here, and a token for the API. Click the little avatar in the top right, go to Access Tokens, and generate a new token. It will ask for your password; once you enter your password and click login, it gives you that new token, and you just copy it. So here's
a new token; I'm going to copy it and switch over to Google Colab. I'm using the free version of Google Colab, so you don't need to pay for it at all. Right here where it says access token, paste in your own token (I'm going to revoke both of these tokens before publishing the video). Next you're going to need the Gradient workspace ID, so switch back to Gradient, X out of here, hit back, and YT Testing is the one I'm going to use. I'll grab this workspace ID, copy it, switch back to Google Colab, and paste it in right here. That's it. Now we need to actually run some of these commands. The first thing we do is install the Gradient AI module, and to do that we run "pip install gradientai --upgrade" and just hit the play button right there. Of course, Gradient also offers a command-line interface, and you can run Gradient straight from a Python file on your local computer. I already have the requirement satisfied because I've already run it, but if you hadn't, it would install it. Next we run this second box: it imports os, and then we use os to set the environment variables for the Gradient access token and the Gradient workspace ID. Let's go ahead and run that. Done. We're almost there; this is actually the last script we need to run, and I'm going to walk you through what each line of this script does. First we're just importing the Gradient library, and then right here on this line we're using it to set the base model. The base model is just the model you want to fine-tune on top of, and for this video we're going to be using Nous Hermes 2, which is a fine-tuned version of Llama 2.
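The setup cells described so far amount to an install step plus two environment variables. Here is a minimal sketch; the shell command is the one stated in the video, while the exact environment variable names the SDK reads are an assumption based on what the video describes, and the values are placeholders you replace with your own.

```python
# Shell step, run once in a Colab cell:  pip install gradientai --upgrade
# Then set the two environment variables the Gradient library uses.
import os

os.environ["GRADIENT_ACCESS_TOKEN"] = "your-access-token"   # placeholder token
os.environ["GRADIENT_WORKSPACE_ID"] = "your-workspace-id"   # placeholder workspace ID
```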
Now, switching to the Gradient documentation: go to the Guides section and click Models, and right here we have the list of models Gradient supports for fine-tuning. Gradient is currently working on adding other models, including Code Llama, which I can't wait for. They have three models: Bloom 560M; Llama 2, the base version, in the 7-billion and 13-billion parameter sizes; and Nous Hermes Llama 2, which is a fine-tuned version of the Llama 2 13B model. You are going to need the slug ID; that is what you enter in Google Colab. So right here is the slug ID, and we grabbed Nous Hermes 2, but of course if you wanted to use Bloom or Llama 2 base you would just grab these slugs right here. Switching back to Google Colab, you can see right here we have that slug, so here we're just loading up the base model. In the next few lines of code we create the model adapter, which is basically just a copy of the base model that we're going to fine-tune. I'm going to name it "test model 3", but you can name it whatever you want, and on the next line I'm simply printing the model adapter ID. Then there's the query I want to run; this is what the prompt will be. We're using "Who is Matthew Berman?", and as you can tell, we're actually using the Llama 2 prompt template: three hashes, "Instruction", and a colon, then the instruction, two new lines, three more hashes, "Response", and a colon, which is where the completion happens. If you're using Bloom or eventually a different model, you just want to make sure your prompt template matches up with the model. Here I just output what the sample query is going to be, and then I do things in three steps. First I output what the response looks like before fine-tuning. The question I'm asking is "Who is Matthew Berman?", and Llama 2 and Nous Hermes Llama 2 have no idea who I am, so they're going to give me a false
result. Then I fine-tune, and I have the samples here (I'll talk about those in a second); then we run it again and see that the model now does have the information about who Matthew Berman is. To run a completion (a completion basically just means a prompt and response), we take the new model adapter, that model copy, call complete, and pass the query and the max generated token count (I believe this can be up to 4096), and then we just print the generated output of the response. Down here we start to get into the samples: this is the training data. I'm going to create an entire separate video about all the tips and tricks for getting the best results from the data set you use to fine-tune, but for now we'll go through it simply. I give it three samples: "Who is Matthew Berman?" with "Matthew Berman is a popular video creator who talks about AI"; "Who is the person named Matthew Berman?" with "Matthew Berman is a YouTuber who talks about AI"; and so on. It's good to give multiple examples so the model has more to base its knowledge on, and a lot of the time it's good to give the inverse example as well: "Who is a YouTuber who talks about AI?" with "It is Matthew Berman." We didn't have to do that, but you may want to try it; again, I'll include all of these tips and tricks in a future video. Now, these next few lines of code are where the actual fine-tuning occurs, and it couldn't be easier. It's really just this one line of code, new_model_adapter.fine_tune, to which you provide the samples. We also have the num_epochs variable here; you can think of epochs as fine-tuning iterations, so we're going to fine-tune it once, twice, three times, and the
more times you do it, the better the results. However, there is a risk of doing it too many times and starting to get bad results; that's called overfitting, but again, I'll get into those details in a future video. For now we leave the epochs at three. Then we have count at zero, the number of iterations, and we say that as long as count is less than the number of epochs, we print the iteration, run the fine-tune, and increment the count variable. So it's going to run three times, fine-tuning the model three separate times on the same data set each time. After the fine-tuning, we simply generate the prompt and response again and see whether we actually have information about who Matthew Berman is. At the very end we delete the adapter, because I don't need it after this; however, if you want to use this for your own personal use or for your business, you would just delete that line, keep the model adapter, and then you can hit the API and use the now fine-tuned model. So let's run it and see what happens. I click start, scroll down, and we watch the output here. There we go: "Created model adapter with ID", and it gives me the model ID. Asking the instruction "Who is Matthew Berman?", it answers: "Matthew Berman is a writer and producer known for his work on the television series The Comeback and Curb Your Enthusiasm." Although Curb Your Enthusiasm is one of my favorite shows of all time, I had nothing to do with it, unfortunately. And here we go: we see fine-tuning iterations one, two, and three, and now it should be running the next completion with that new fine-tuned model. There it is, generated after fine-tune: "Matthew Berman is a popular YouTuber who creates content about AI and its impact on society." So cool. And just like that we have
our very own custom fine-tuned model. It really could not be easier than that. Now, the data set you're working with is critical, so I'm excited to make that next video about all the necessary tips and tricks, but one tip I'll give you right now is to use ChatGPT to help you create the data sets. For example, I'm going to highlight these samples that I created by hand, copy them, switch over to ChatGPT, and say: "Here is a data set for training an LLM. Please create more variations of this data set for training. Keep it in the same format." Then I just paste in the existing one, hit enter, and it generates more for me. So use ChatGPT to create your own data set; it's super easy, and you can ask it to create a data set about any topic it's familiar with. You can say, "Give me a bunch of training data using the voice of Eric Cartman from South Park" (or any character you want), or "Give me a data set that specializes in quantum mechanics." That's the biggest tip for creating your data set that I'm going to give you today. And there we go, it's creating a bunch of different variations; I could just copy this, switch back, and paste it in right here, but I don't need to, because it already trained properly. And that's it; you're done, you have everything you need. Again, Gradient gives ten dollars in free credits, and I'm going to drop all the links in the description below: the link to this Google Colab and the link to Gradient. Go get your ten dollars in free credits, train your own model, and use their inference engine. It's super exciting, and I know they have on their roadmap the ability to download these models so you can use them locally as well. So let me know about the fine-tuning you do; I want to hear all of your creative ideas for fine-tuning models. Thank you again to Gradient, this was awesome, and if you liked this video please consider giving a like and subscribe, and I'll see you in the next one.
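The prompt template and the hand-written samples described in the walkthrough above can be sketched in plain Python. The "### Instruction / ### Response" layout follows the Llama 2 style template the video describes; the {"inputs": ...} dictionary shape for the samples is an assumption about what Gradient's fine-tune call expects, based on the names mentioned in the video.

```python
# Llama 2 style template: "### Instruction: <prompt>\n\n### Response: <completion>"
def make_prompt(instruction: str, response: str = "") -> str:
    return f"### Instruction: {instruction}\n\n### Response: {response}"

# The three hand-written training samples, including the inverse example.
# The {"inputs": ...} schema is an assumption about Gradient's API.
samples = [
    {"inputs": make_prompt(
        "Who is Matthew Berman?",
        "Matthew Berman is a popular video creator who talks about AI.")},
    {"inputs": make_prompt(
        "Who is the person named Matthew Berman?",
        "Matthew Berman is a YouTuber who talks about AI.")},
    {"inputs": make_prompt(
        "Who is a YouTuber who talks about AI?",
        "It is Matthew Berman.")},
]

# The query used before and after fine-tuning: the response slot is left
# empty so the model fills in the completion.
query = make_prompt("Who is Matthew Berman?")
print(query)
```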
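The epoch loop from the Colab is just a counter around a single SDK call. Here is a runnable sketch in which fine_tune_stub is a hypothetical stand-in for new_model_adapter.fine_tune(samples=samples), since the real call needs Gradient credentials; the control flow (count, num_epochs, the while loop) mirrors what the video describes.

```python
# Fine-tune the same adapter on the same samples num_epochs times.
samples = [{"inputs": "..."}]  # placeholder training data

calls = []
def fine_tune_stub(samples):
    # Hypothetical stand-in for new_model_adapter.fine_tune(samples=samples);
    # it just records that a fine-tuning pass happened.
    calls.append(len(samples))

num_epochs = 3  # more epochs can help, but too many risks overfitting
count = 0
while count < num_epochs:
    print(f"Fine-tuning the model, iteration {count + 1}")
    fine_tune_stub(samples=samples)
    count += 1
```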
Info
Channel: Matthew Berman
Views: 92,146
Keywords: fine-tuning, fine tune, llm, llama, llama 2, fine tune llama, training llm, large language model, code llama, ai, artificial intelligence, google colab, python
Id: 74NSDMvYZ9Y
Length: 9min 43sec (583 seconds)
Published: Tue Sep 12 2023