ChatGPT Fine-Tuning: The Next Big Thing!

Video Statistics and Information

Captions
This is huge: starting today you can fine-tune ChatGPT on your own datasets, something people have been waiting for for a while. In this video we'll see what benefits you get when you fine-tune ChatGPT on your own dataset, we'll look at the pricing structure, and I'll walk you through a code example of how to fine-tune a model on your own data. At the end of the video we'll also explore why it may not be a good option for most people.

So what exactly do you get with fine-tuning? When you fine-tune a model you get improved steerability, which basically means the model will follow instructions better. You also get reliable output formatting, so the model will stick to the output format you want, and you will be able to customize the tone of the model. A few things they have highlighted in their blog post: first, when you fine-tune a model you can use shorter prompts and still get better performance. Also, compared to the earlier fine-tunable GPT-3 models, fine-tuned GPT-3.5 Turbo can handle 4,000 tokens, and from their tests it seems you can reduce prompt size by up to 90 percent by fine-tuning instructions into the model itself, thus speeding up API calls and cutting costs. Now, the beauty is when you combine fine-tuning with other approaches such as prompt engineering, information retrieval, and function calling; that's what makes these LLMs a lot more powerful.

Next, before looking at a fine-tuning example, let's look at the pricing. They have divided the pricing into two parts: the initial training cost and the usage cost. Training costs $0.008 per 1,000 tokens, input prompts cost $0.012 per 1,000 tokens, and output usage costs around $0.016 per 1,000 tokens. At the end they provide a very simplified example: a training job of around 100,000 tokens will cost you around $2.40 to train the model (their example assumes training for three epochs, so 100 x $0.008 x 3 = $2.40). We're going to come back to this pricing at the end of the video, because there is a lot to unpack here.

Now let's look at how exactly you do the fine-tuning. In the blog post they provide this schema: first you prepare your dataset, which is supposed to contain a system message, the input from the user, and the response from the assistant. Then you make an API call to OpenAI to upload the file, create a training job through the API, and finally use the fine-tuned model through the API as well. But we are going to look at a more concrete Python example. OpenAI has a very nice fine-tuning guide on their website where they give you the reasons you might want to fine-tune: higher quality results than prompting, the ability to train on more examples than can fit in a single prompt, token savings due to shorter prompts, and lower latency. It's a very detailed guide and I would recommend everybody go over it if you are really trying to understand fine-tuning OpenAI models.

Let's say you want to fine-tune the model on your own dataset. First you need to put your data in the proper format. Your data is supposed to use three different roles: the first is the system message that we provide to the model, then the user role whose content is the input prompt from the user, and then the response from the assistant. You arrange all your examples in this format in a single training file, and that file is what the model is trained on.
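As a rough illustration of that format, here is a minimal sketch of what a couple of training examples might look like. The file name, system message, and question/answer pairs are made-up placeholders, not taken from the video; note also that the fine-tuning endpoint expects JSON Lines (one JSON object per line) rather than a single JSON document.

```python
import json

# Hypothetical training examples in the chat format: each example is a list of
# system / user / assistant messages. The content below is purely illustrative.
examples = [
    {
        "messages": [
            {"role": "system", "content": "You are a sarcastic support bot."},
            {"role": "user", "content": "How do I reset my password?"},
            {"role": "assistant", "content": "Ah, the classic. Click 'Forgot password' and follow the link."},
        ]
    },
    {
        "messages": [
            {"role": "system", "content": "You are a sarcastic support bot."},
            {"role": "user", "content": "Where can I download my invoice?"},
            {"role": "assistant", "content": "Invoices hide under Settings > Billing. They like it there."},
        ]
    },
]

# Write one JSON object per line (JSONL), which is the layout the fine-tuning API expects.
with open("my_data.jsonl", "w", encoding="utf-8") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")
```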
Once your data is in the proper format, the next step is to upload that file to OpenAI. For that we use the openai Python package: you provide the file name (let's assume the data is stored in the my_data.jsonl file above) and the purpose, which in this case is fine-tuning. After that you simply start the fine-tuning job: you provide your OpenAI API key and make a call to create a fine-tuning job, which creates the job and starts training. You need to provide two inputs here: the training file, which is the ID of the file you just uploaded, and the name of the base model, so if you want to fine-tune GPT-3.5 Turbo you pass that model name. Once the model is fine-tuned and you want to use it, it's as simple as using the Chat Completion API from OpenAI: you provide the name assigned to your fine-tuned model, the system message, and the user input, and you get the assistant's response as output (a sketch of these three API calls is included at the end of these captions). So it is very simple to fine-tune GPT-3.5 using the OpenAI API. I'm going to create a more detailed video on how to fine-tune this on your own dataset, where we'll look at how to structure the dataset and then make the API calls to get a fine-tuned model, so stay tuned for that.

Now let's look at some not-so-great things which might deter some people from using this service. The first one is the safety feature: your training data is passed through their moderation API and then a GPT-4-powered moderation system to detect unsafe training data that conflicts with OpenAI's safety standards, so you are limited by those standards. The second thing to consider is the price itself, because the fine-tuned GPT-3.5 Turbo model is a lot more expensive than the vanilla GPT-3.5 Turbo model. Just as an example, for input tokens it's about eight times more expensive than the 4K-context GPT-3.5 Turbo model, and for output tokens it's about 5.3 times more expensive than the original 3.5. Compared to GPT-4 it's still less expensive, but the performance is probably not going to be as good as GPT-4's. So if you're fine-tuning a model, you really need to consider this substantial increase in price.

It's a very interesting development, and it will be interesting to see what people build on top of it, although we still have to see whether it can actually give a performance boost in your applications, and whether that boost is worth the substantial increase in price. Let me know what you think in the comment section below. I will be making a more detailed video on how to fine-tune this model on your own dataset. If you found the content useful, consider liking the video and subscribing to the channel. Thanks for watching and see you in the next one.
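As a companion to the walkthrough in the captions, here is a minimal sketch of the three calls described there: uploading the training file, creating the fine-tuning job, and querying the resulting model. It uses the pre-1.0 openai Python package that was current when the video was published (newer versions of the library expose the same operations through a client object). The API key, file name, fine-tuned model name, and messages are placeholders, not values from the video.

```python
import openai

openai.api_key = "sk-..."  # placeholder; use your own OpenAI API key

# 1. Upload the JSONL training file, marking its purpose as fine-tuning.
upload = openai.File.create(
    file=open("my_data.jsonl", "rb"),
    purpose="fine-tune",
)

# 2. Create the fine-tuning job: the ID of the uploaded file plus the base model name.
job = openai.FineTuningJob.create(
    training_file=upload.id,
    model="gpt-3.5-turbo",
)
print(job.id, job.status)  # the job runs asynchronously; poll it or wait for OpenAI's email

# 3. Once the job has finished, query the fine-tuned model through the Chat Completion API.
#    The model name below is a made-up example of the "ft:..." name the job returns.
response = openai.ChatCompletion.create(
    model="ft:gpt-3.5-turbo-0613:my-org::example123",
    messages=[
        {"role": "system", "content": "You are a sarcastic support bot."},
        {"role": "user", "content": "How do I reset my password?"},
    ],
)
print(response.choices[0].message.content)
```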
Info
Channel: Prompt Engineering
Views: 24,841
Keywords: prompt engineering, Prompt Engineer, finetune llama on custom dataset, how to finetune llm, llama v2 instruction finetuning, llama finetuning, llama fine tuning, llama v2 finetuning, llm training custom dataset, llm finetuning, how to train llm, autotrain llm, autotrain llm training, fine tune OpenAI, ChatGPT fine tune, train ChatGPT on your own dataset, how to train chatgpt on your data, openai model training
Id: _THApvyj4S0
Length: 8min 27sec (507 seconds)
Published: Wed Aug 23 2023