Easily Fine Tune ChatGPT 3.5 to Outperform GPT-4!

Video Statistics and Information

Captions
What if I told you there's a way to make an AI language model that's smarter than GPT-4, and you can make it in under 30 minutes? OpenAI has finally allowed us to fine-tune their GPT-3.5 Turbo model, and after fine-tuning it can actually be smarter than GPT-4 on your use case. This works with just the fine-tuned model, no data retrieval or embeddings required, and the fine-tuned model will be specialized on your specific data. Today I want to walk through an example of how to fine-tune GPT-3.5, along with some strategies for preparing your training data to make sure you get the results you're looking for. By the end of the video, you'll have an advanced fine-tuned model that's specialized on your specific training data and cheaper to use than direct API calls with large context prompts.

What a lot of people seem to get confused about are the benefits that fine-tuning can actually provide. The media uses the phrase "fine-tuning" with AI so much that people associate it with making an existing model smarter in a general sense. Fine-tuning is really about changing the tone, style, or format of responses: giving a bot a specific personality, or a response format you're looking for. If, for example, you want your bot to generate responses like a Gen Z helpful assistant, you can train it to speak like a Gen Z bot. You can also train it to keep its responses short and give only concise replies, rather than the lengthy, over-explained responses ChatGPT normally gives. It's also useful when you have specific edge cases where you want the model to give a particular response instead of the "as an AI language model trained by OpenAI" spiel you get when you go beyond ChatGPT's capabilities. And from a cost perspective, once you've trained it on the style you want it to respond in, along with the various edge cases you want different responses for, it can actually be cheaper to run the fine-tuned model, because you no longer need to send a large context along with every prompt.
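To make the role-based idea concrete, here's a minimal sketch of what one training example looks like in the chat-format JSONL that OpenAI fine-tuning expects, using the Gen Z assistant style mentioned above (the question and reply themselves are invented for illustration):

```python
import json

# One training example in the chat-format JSONL that OpenAI fine-tuning expects.
# The Gen Z style matches the example above; the content is invented for illustration.
example = {
    "messages": [
        {"role": "system",
         "content": "You are a helpful assistant that talks like a Gen Z bot."},
        {"role": "user",
         "content": "How do I reset my password?"},
        {"role": "assistant",
         "content": "bet, it's super easy: hit 'Forgot password' on the login page and check your email. no cap."},
    ]
}

# Each example becomes exactly one line of the .jsonl training file.
line = json.dumps(example)
```

The training file is just many of these lines, one per example, all sharing the same system message so the style sticks.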
The fine-tuned model will naturally respond in the style and tone of the data you trained it on.

To get started, I first want to reference the OpenAI website, where they have an example that does a decent job of explaining how well this works. There were a few things I thought could be explained a little better, and I hope this video makes them super clear for you.

The first thing we need to do is organize and format our data. OpenAI has been training ChatGPT on role-based inputs to get better responses, so our training data needs to be in that same format. The example from their site shows the preferred format: you explicitly state "messages", define the role, and then give the content for the system, the user, and the assistant (which is the bot's response). I copied this template from OpenAI and came up with a handful of examples myself for a hospitality bot that we can use in our example today. You could use actual data for this as well, if you happen to have conversations with customers or guests and specific responses, in the style you want the bot to have, in this format.

OpenAI requires at least 10 examples to successfully fine-tune a model, but they suggest using at least 50 to verify that your training data is having the intended effect on the fine-tuned model. In the end, 50 examples may be enough, or you may need more; it can take a few hundred examples to get the intended responses out of the fine-tuned model. Be careful with too many examples, though, because you can overfit the model and start to get gibberish as responses.

For this example, rather than sit here and come up with 50 to 100 examples, I used a prompt in GPT-4 to generate some example training data for us. I simply said: "Using the structure of the data below as an example, create 100 lines of new data examples, putting each example on a new line in a table that I can easily copy. You must maintain the structure of the examples below, and each example should have the line with messages, role, system, and the content 'You are an overly friendly hospitality chatbot named Chatner who just loves to help people, and you're not satisfied unless the customer is completely satisfied' as the system content for each example, but create new examples for the user content and the assistant content." Essentially, I wanted to maintain the system prompt for each of the examples and create new user content, and new responses as the assistant. Then I pasted in the three examples I came up with on my own.

Assuming this is a back-and-forth with an Airbnb chatbot: in my first example, I said, "I can't find the Wi-Fi password," and the bot replied, "I'm terribly sorry to hear that, and I can most certainly help you. The Wi-Fi password is always pretty tricky for people to find; we really need to make it more obvious. The Wi-Fi password is stored under the router on a Post-it note. Let me know if you have any issues locating it." So I gave it a very specific tone, different from how traditional ChatGPT would respond. In the second example, I said, "We can't seem to figure out how to turn the AC off, can you help us?" and the bot responds, "It must be freezing in there, I'm so sorry to hear that! The thermostat is located next to the front door, and if you hit the up and down arrow buttons, it should adjust the temperature for you. Let me know if you have any issues adjusting it." The third example: "Are there extra trash bags anywhere?" and the response is, "We have extra trash bags, yes! We actually keep them under the kitchen sink, but they can fall behind stuff under there, so they can be tricky to find. It's possible that we're out, so if you don't see any, let me know and I'll make sure some get brought over right away."

Based on this, I'm asking it to generate 100 examples, which is actually outside the token limit for GPT-4 on the site, but it has no problem generating five examples in response. So you can see: "Absolutely, I can do this," and then it comes up with five additional examples. I took these, along with my three, and copied and pasted them into this Excel sheet, so each line has a new example on it in the format we prompted ChatGPT for. It said, "I just did five examples, let me know if you'd like a hundred," and I said "continue" and it made more (ten more, actually), and then I said "continue" and it made ten more, and I kept doing this until I had at least 50 examples. I think in this one I have 85. All of them are completely different examples, and this is the training data that will go into this fine-tuning example.

If you end up using ChatGPT to generate your training data, I recommend combing through it and making sure the examples it came up with are still in line with the tone and style you're trying to create. I found that when I was creating mine, the bot started to transition from an Airbnb bot to a hotel bot, referencing room service and the hotel property and that sort of thing. So you want to read through it; fortunately, it's only 50 to 100 examples, so it's not terrible to comb through.

The other thing OpenAI recommends for the training data is to include some examples that directly target cases where the prompted model is not behaving as desired, where the provided assistant data is the ideal response you want the model to give. For our Airbnb hospitality bot example, I would include something like the user prompt "Who is the president of the United States?", which is beyond the scope of an Airbnb hospitality bot, with the assistant saying, "That's a good question, but not something I can really help you with. Do you have a question related to the Airbnb you're staying in?" This helps define the edge cases: when questions are outside the context of what you want your bot to answer, it has a response for that. This is similar to the "as an AI language model trained by OpenAI" response; you're sort of replacing that by putting these examples in there.

Once you have all your training data in this format, go to File, Save As (I've called mine "helper-example") and select CSV UTF-8 as the file format. This is very important for how our Python code will read the text from the Excel sheet. Then click Save. Now we have our training data as a CSV file in a folder somewhere on our computer, and we're ready for the code.

I'm going to run all of this from Google Colab, so go to colab.research.google.com, and when it pulls up the window, go to File, New Notebook. I'll have these files in the video description if you want to just download them, or you can build them along with me. I'm going to title this one "Fine-tuned GPT 3.5 Turbo". The first thing I'm going to do is install the packages we need for this code. To do that, run pip install with openai, numpy, tiktoken, and gradio. openai is for the chat completions and the bot we're fine-tuning; tiktoken is because we're actually going to calculate how much the fine-tuning will cost before running it (I'll show you how, and it can be handy); and gradio is just for a user interface, so we can test the fine-tuned model at the very end. Once you've got that, run it, and once it finishes, add a new line with the plus-code button.

Now we'll import the libraries we need: import openai; import csv, because we'll be dealing with the CSV file we just created for our training data; import json, because we're going to convert our data to a JSON format; import os, the operating system module, which is just a way for us to interact with some of our files; and import numpy as np and, from collections, defaultdict, which give us the ability to interact with some of our data.
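The csv and json imports above do the heavy lifting when the spreadsheet gets turned into training data. As a rough, self-contained sketch of that conversion (the file names are placeholders, and I'm assuming each spreadsheet row holds one example's JSON in its first cell, which is how the examples were pasted in one per line):

```python
import csv
import json

def csv_to_jsonl(csv_path, jsonl_path):
    """Convert the spreadsheet export into a .jsonl training file.

    A sketch: assumes each CSV row's first cell holds one complete
    training example as JSON, i.e. {"messages": [...]}.
    """
    clean_data = []
    with open(csv_path, "r", encoding="utf-8") as f:
        for line_num, row in enumerate(csv.reader(f), start=1):
            if not row or not row[0].strip():
                continue  # skip blank lines
            try:
                example = json.loads(row[0])
            except json.JSONDecodeError as err:
                # Flag bad rows so you can clean them up in the CSV by hand
                print(f"JSON error on line {line_num}: {err}")
                continue
            if "messages" in example:  # the role-based format OpenAI expects
                clean_data.append(example)
    with open(jsonl_path, "w", encoding="utf-8") as f:
        for example in clean_data:
            f.write(json.dumps(example) + "\n")  # one example per line
    return clean_data
```

Rows that fail to parse get printed rather than silently dropped into the training file, which mirrors the error check described in this walkthrough.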
And then, again, we import tiktoken, because we're going to calculate the tokens, and gradio. Once you have that, you can run it.

Now we're going to plug in our OpenAI API key. Add openai.api_key = "" and paste your OpenAI API key between the double quotes. If you don't know where to find it, I have a video on how to access it, and I'll link it above.

Next we need to move the CSV file with our training data into the Google Colab session. To do that, click the folder button, then Mount Drive, and a window will pop up asking if you want to permit the notebook to access your Google Drive files. Hit "Connect to Drive", and you'll see it mounting Google Drive. Once that finishes, the drive folder will pop up and you can go into it. If you right-click, you can choose New Folder and create a folder where we'll store the CSV training data we just created. I'm calling mine "fine tune 3.5-turbo"; press Enter, and you can see it created there. Then pull up the location where you saved that CSV, and drag and drop it right into the folder you just created in Colab. It will pop up a warning; just click OK, and you'll see your CSV data inside Colab. If you press the three dots next to it, you can click Copy Path, and we'll use that in a second. We can collapse the folders and do the next line of code, now that we have our CSV file in the Colab session.

To load the CSV data into the Colab session, I've pasted in the code that pulls it in. What we're doing here is defining the CSV file path: inside the single quotes, paste the file path you just copied with the Copy Path button. Next, we open an empty clean-data list, and the following line opens the CSV file (the 'r' is for read), specifying the encoding as the UTF-8 we saved it with earlier. The line after that goes through the data and, based on OpenAI's recommended format, ensures it's formatted correctly. The last line identifies any errors in the data and prints them.

Then, in the code further down, I've got the JSON file path. What this does is create a JSONL file with our cleaned-up data; this is what we'll load into the OpenAI training. You need to define a file path for it. I used the same folder as my CSV training data and pasted the path here, changing the file name to end in .jsonl, because you want it saved as a JSONL file. Once that file is created, we write the cleaned-up data to it, which is what this line does: the 'w' is for write in this instance, and again we maintain the UTF-8 encoding while writing the data to the file we just created. Once you have your file paths set, you can run it. Assuming you don't get a JSON decode error (which would tell you which cell the error is in, so you can go into the CSV file and clean it up), it should create a JSONL file in your folder location, and you can see we now have the JSONL file inside the folder we created.

This next section is actually from the OpenAI fine-tuning documentation on their website. If you go to the documentation section of OpenAI's site and open the fine-tuning portion, it walks you through how to fine-tune, but some of it can be a little confusing. This portion takes the data we just loaded as a JSONL file and validates its format using the script OpenAI provides on their website. If you click the arrow here, it opens up the script they've included. You can see I've already loaded some of these libraries, and they've defined the data path. I copied this and modified it a little for our code, but essentially what I'm pasting here is from their website; I put the link in the file, if you download it from the video description. We need to define the data path as the JSONL file that was just created in our folder, so click the three dots, hit Copy Path, make sure your data path is the JSONL file we just created, and close the single quote. This script essentially goes through the JSONL data and makes sure each line is formatted correctly, so that every example is formatted exactly how OpenAI wants, with the messages, role, system, and content structure we set up initially. We shouldn't run into any issues, since we copied the recommended format directly, but this is a good check in case you've missed something, or a parenthesis is missing somewhere.

There's a lot going on in this data reading, but a few things I want to highlight. The script counts the number of tokens required to fine-tune on the dataset we've fed into it. One of its features is verifying that no single example exceeds the 4,096-token limit; if any does, it identifies it for you so you can either modify it or remove it entirely. Otherwise, the training will just truncate it at the 4,096-token limit. As I mentioned before, the script also counts the number of tokens it will take to fine-tune on the data you fed in. I modified this sum, because their script just says "see the pricing page for estimated total cost", but we can have the script give us the actual cost based on the rates OpenAI publishes on their pricing page. If you go to the API pricing page on openai.com and scroll down, they now have a section for fine-tuning models, and you can see that training a fine-tune of GPT-3.5 Turbo costs $0.008 per 1,000 tokens. Using this rate, I modified our script to include a last line that calculates the estimated cost to fine-tune. Their pricing is per 1,000 tokens, and I just wrote it as the price per 100,000: 80 cents per 100,000 tokens (if the pricing ever changes, you can update this number). It then multiplies by the number of epochs, the number of times it goes through the training data, which defaults to three, and prints out the estimated cost to fine-tune after you run it. The biggest thing here is to make sure your data path is set to the JSONL file you created in the section above; the rest just outputs information for you and gives a cost estimate at the bottom.

So we can run it, and from the output you can see we have 85 training examples, it found no errors, no examples are missing system messages, and none are missing user messages. At the very bottom, it says zero examples are over the 4,096-token limit. The dataset has 6,554 tokens, and because we're running three epochs, you multiply that by three for a total of 19,662 tokens; at 80 cents per 100,000 tokens, that comes out to 16 cents to train our example here. Not too bad. At this point I'll mention: subscribe if you enjoy this kind of content, I really appreciate it and it really helps me out. But on to the next section.

Now we're going to write the formatted data that OpenAI's script produced to a JSONL file in our Colab session. We've modified the data from the original JSONL file, and we want to write it to a new one, so I'm just going to copy the original JSONL file path from above and name this one "clean", because it's the cleaned-up data. This saves to the JSONL file, with write access, each line as a new line inside the file we've created. Once you've pasted the new file name into the JSONL file path section, run it, and once it finishes, you should see our new cleaned-up JSONL training data file right there.

The next section uploads the cleaned-up JSONL file we just created to OpenAI. Do a new line; for the training file name, we need to set it to the clean JSONL file we just created, so go back to the folder, find the clean JSONL file generated in the previous section, hit Copy Path, and paste it. What we're doing here is setting the training file name to that new cleaned-up file and sending it to OpenAI's servers for the purpose of fine-tuning. They identify it with an ID on their servers (all of our data gets a unique ID), so when we run this, it prints that file ID for us to reference in the next section. You can now see this is our training data on OpenAI's servers; this is the reference to it.

Now that our data is on OpenAI's servers, we can start the fine-tuning process. This next section creates the fine-tuning job on OpenAI's servers. The first part lets you name the fine-tuned model; I've named mine "chatnerbot". You can name yours whatever you'd like, but make sure it's all lowercase and doesn't start with a number. The next part actually creates the fine-tuning job: we identify the training file we just defined in the previous section, and we identify the model we're going to fine-tune. This is where we fine-tune the GPT-3.5 Turbo model, whereas previously you could only fine-tune davinci, curie, babbage, or ada.
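The cost arithmetic just described (dataset tokens, times epochs, times the published training rate) is simple enough to sketch as a little helper. The $0.008-per-1,000-token rate is the one quoted in this walkthrough; update it if OpenAI's pricing page changes:

```python
def estimate_finetune_cost(dataset_tokens, n_epochs=3, usd_per_1k_tokens=0.008):
    """Estimated fine-tuning cost: the dataset is billed once per epoch
    at the published training rate (USD per 1,000 tokens)."""
    billed_tokens = dataset_tokens * n_epochs
    return billed_tokens * usd_per_1k_tokens / 1000

# The numbers from this walkthrough: 6,554 dataset tokens over 3 epochs
# is 19,662 billed tokens, which at 80 cents per 100,000 tokens is about 16 cents.
print(round(estimate_finetune_cost(6554), 2))
```

This reproduces the 16-cent estimate that the modified validation script printed out.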
That's what's new about fine-tuning OpenAI's models these days. Finally, this fine-tuning job gets an ID on OpenAI's servers, and we have it print that ID so we have a reference to it. When you run it, you'll get an output with the fine-tuning job ID, which is the reference to our fine-tuning job on OpenAI's servers, and the training file, which you can see references the training file from above, so it's using the data we uploaded. Once you run this, it actually starts the fine-tuning process with your training data. You can see that, because we haven't specified the number of epochs, it defaults to three, which is usually a good number to use; with a higher number, you can actually overfit the data and start to get gibberish responses.

While this is running, we can't really see the status, so if you want to check what's actually happening, you can run this section of code, which lists the events as it's fine-tuning so you can see the progress. When I run it, I can see that the fine-tuning job has started, and it references that fine-tune ID on OpenAI's servers. The fine-tuning can take a while to run, based on the amount of training data you fed into it. What's kind of cool with 3.5 fine-tuning is that it will actually send you an email when the fine-tuning is complete, to the email address linked to the API key you plugged in originally, so you can just leave this running and you'll get an email once it's done. I'm going to let that run and wait for my email to show up, and then we'll come back when we have our fine-tuned model.

Now, 12 minutes later, I received this email from OpenAI that says, "Hi there, your fine-tuning job" (with the ID of the fine-tuning job on their server) "has successfully completed," and it gives me my new model name and says it's been created for our use. You can see it includes the "chatnerbot" name I set in the code earlier. If I rerun the section that created the job, it shows each of the training steps it took, and you want to see the loss get lower and lower; fortunately, ours ended on one of the lower loss values, and you want that number to be as low as possible. We can also see that it gives us the fine-tuned model name, so you can either copy it there, or we can add a section that stores the new fine-tuned model ID it created as fine_tuned_model_id. That's what this line does: it pulls the fine-tuned model ID that was created and labels it fine_tuned_model_id, which is a little easier to handle in our code afterwards than plugging in the whole "ft:gpt-3.5-turbo..." string. So we'll run that.

Okay, now we can test it out. I'm going to paste in a section that opens a test-message list, and I'm defining the system message the same as what was in the training data: "You are an overly friendly hospitality chatbot named Chatner who loves to help people," and so on. For the test message, it's going to use the same format as our training data, with the role set to system and the content set to the system message we defined above. Then I define the user message for our test here as "Where should we park?", and again it plugs that into the same format as before, with the role set to user and the content "Where should we park?", and then we'll just have it print that.

Now we'll plug that into OpenAI's chat completion API so we can test it out. Add a new line, and set the response to openai.ChatCompletion.create. Here we're defining the model that will complete the chat, that will answer the message input we're giving it: we define it as the fine_tuned_model_id we set in the previous section, which is the "ft:gpt-3.5-turbo..." model that OpenAI gave us. The messages are the test messages we set in the previous section. I set the temperature to zero, because I don't want it to be creative; I want it to give answers based on the training data. I've set max_tokens to 500, and then we'll just have it print the first response it comes up with. This gives us a response to "Where should we park?", which is stored as the test messages: "We have a parking lot next to the building. It's free for guests, so feel free to park there." That response is based on the 85 training examples I gave it.

Another way we can test it out is with gradio, which gives us a little easier user interface to interact with the fine-tuned model. This code just defines and sets up the chat completion we had set up previously and returns the first result, and then it loads the gradio interface with a couple of labels that change the user interface once it loads. You can set debug to True if you end up getting errors, and it will show all the workings; if you exclude it, it makes the user interface without a bunch of code running below it. And if you set share to True, it gives you a link you can share with friends; as long as the Colab is running, they can chat with your bot, so that's kind of cool.

So if I run it, it opens our Chatner Airbnb helper, and I can ask it, "Where should we park?" and it says there's a parking lot available near the entrance. We'll say, "Is there free parking?" "Yes, there's free parking available for guests." Pretty straightforward. We can also test this against how GPT-4 would respond to "Is there free parking?", and you get this lengthy, wordy, "what is the meaning of free parking" sort of response out of GPT-4. If we want to make it a little more apples-to-apples, we can put the system message from our training data in front of it: "You are an overly friendly hospitality chatbot named Chatner who loves to help people, and you're not satisfied unless the customer is completely satisfied. Is there free parking?" You can see that, even then, GPT-4 gives a pretty lengthy response: "Oh, I'm absolutely delighted to assist you with your query," with an emoji, "when it comes to parking, it really depends on where you're planning to go. Could you please specify the location or type of establishment you're inquiring about? Whether it's a hotel, an airport, a shopping center, or something else..." As you can see, this has no idea that we're referencing an Airbnb helper bot, or the details of where that parking would be. The tone is close, but you can see our fine-tuned example is much more in line with the training data we provided it.

We can also check how much the training actually ended up costing versus the prediction: it charged me 16 cents, which is exactly what the token calculator predicted.

That's pretty much it; that's all it takes to fine-tune GPT-3.5. Again, with the right training data, you can make GPT-3.5 smarter than GPT-4 within your specific context. It requires a lot less context in the prompting, which can make your API calls cheaper, and it still gives you the intended responses within the context of what your chatbot is replying to. If you found this video interesting, please give it a thumbs up and make sure you subscribe. I'll be making more videos in the future, and that way you'll get notified when they come out. I hope you enjoyed it, and thanks for watching.
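Pulling the testing step above together, here's a rough sketch of the chat-completion call as described in the video. It assumes the pre-v1 openai Python SDK (the openai.ChatCompletion interface, openai<1.0, which was current when this video was made), and the model ID shown in the usage comment is a placeholder for whatever your own fine-tuning job returns:

```python
SYSTEM_MESSAGE = (
    "You are an overly friendly hospitality chatbot named Chatner who just "
    "loves to help people, and you're not satisfied unless the customer is "
    "completely satisfied."
)

def build_test_messages(user_message, system_message=SYSTEM_MESSAGE):
    """Format a test prompt the same way as the training data:
    the system message first, then the user's question."""
    return [
        {"role": "system", "content": system_message},
        {"role": "user", "content": user_message},
    ]

def ask_bot(fine_tuned_model_id, user_message):
    """Send one question to the fine-tuned model and return its first reply."""
    # Deferred import: pre-v1 SDK (openai<1.0), only needed for the actual call.
    import openai
    response = openai.ChatCompletion.create(
        model=fine_tuned_model_id,  # the "ft:gpt-3.5-turbo..." ID from your job
        messages=build_test_messages(user_message),
        temperature=0,              # stick to the training data, don't get creative
        max_tokens=500,
    )
    return response["choices"][0]["message"]["content"]

# Example usage (requires openai.api_key to be set; model ID is a placeholder):
# ask_bot("ft:gpt-3.5-turbo:your-org:chatnerbot:xxxxxxx", "Where should we park?")
```

Keeping the prompt-formatting helper separate from the API call makes it easy to reuse the same message structure in a gradio handler, as the video does for its chat interface.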
Info
Channel: Tech-At-Work
Views: 10,932
Keywords: AI, ChatGPT, Fine tune, Fine-tuning, GPT, GPT-3.5, GPT-4, GPT3.5, GPT4, LLM, Openai, ai data, artificial intelligence, chat gpt, excel, fine tune data, fine tuning, fine-tune, fine-tune data, gpt 4, gpt3, how to, open ai, python, tutorial, langchain, pinecone, aichatbot
Id: 8Ieu2v0v4oc
Length: 25min 15sec (1515 seconds)
Published: Mon Sep 11 2023