Okay. In this video, I'm going to go through how to fine-tune a PaLM model. This is one of the things that was added recently to Vertex AI in Google Cloud. You will need a Google Cloud account to do this, but once you've got your Google Cloud account set up, it's actually very easy. You just come into Vertex AI. In Vertex AI there's now a whole bunch of different things that relate to generative AI, training models, et cetera. We've got Colab Enterprise, which was added recently at Google Cloud Next. If we come into the generative AI section, we want to actually come into Language. And here we get the place where
we can test out the PaLM 2 models. We can try different prompts and
different features that have already been pre-saved in here: examples of doing classification, examples of doing extraction, doing writing, a whole bunch of different tasks. And if we wanted to make our own prompts and such, we could come in here and do that for a text prompt, a code prompt, text chat, or code chat. But what we want is over here: the Create Tuned Model option. When we click into it, the user interface is actually very simple. There are only four things we need to fill in, and only three of them are required. First off is to pick the tuning
method you're going to use. We're going to use plain supervised fine-tuning. You can do reinforcement learning from human feedback (RLHF) in here, but that requires your dataset to be set up in a different way, so we're going to go with supervised tuning. What we're going to do is come in here and basically set this up. Here, I'm just going to put in a test name for the model, and then we pick which model we
want to actually do the tuning for: do we want to do it for text-bison, code-bison, or the chat models? Next up is where Google's being a little bit confusing. They've got "Number of train steps", which it says is the number of steps to run for training. The challenge is that they don't actually tell you what the batch size is. After looking through a lot of code and a number of other things, I'm reasonably convinced that the batch size is 64. So really what you want to do is take your dataset size and divide it by 64, and that will tell you how many steps you need for one epoch. A lot of the time you'll only want, say, two or three epochs of your data, so if you've got a small dataset you're going to need a small number of steps, and if you've got a large dataset you're going to need a large number of steps.
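Just to make that arithmetic concrete, here's a tiny helper for picking a train-steps value. It leans on my assumption above that the effective batch size is 64, which Google doesn't document, so treat the result as a rough guide rather than an official formula:

```python
import math

def palm_train_steps(num_examples: int, epochs: int = 3, batch_size: int = 64) -> int:
    # Steps for one epoch = dataset size / batch size (rounded up),
    # then multiplied by however many epochs you want.
    # batch_size=64 is an assumption, not a documented value.
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch * epochs

# For example, a 1,000-example dataset for roughly three epochs:
print(palm_train_steps(1000, epochs=3))  # -> 48
```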
In a second, I'll have a look at how you actually make the dataset for this. But first off: we've got our dataset and we're going to set the train steps. The next thing is the learning rate multiplier. Again, this is Google being a bit confusing. Generally, I find you want to set this somewhere between three and five. It's just a multiplier of whatever the base learning rate is, and unfortunately they don't tell us what that learning rate is, so that's a key annoyance. Next, you want to
choose a working directory. This is a Google Cloud Storage bucket; if I come in here, I can set up a Google Cloud Storage folder. I'm just going to call this one 003. This is where all our models, et cetera, will be stored as we go through it later on. Next up, we've got to pick the region. Now, this is quite important, because
if we pick us-central1, we're training with A100s; it's going to use 8 A100s to do the training. If we go with europe-west4, we're actually using TPUs: a 64-core TPU pod slice. Whichever you train with, you'll end up with the same model, so that's not hugely different. At the time I first played with this, though, I found that using us-central1 I wasn't getting good results. My guess is that there's a bug; I think this is still in preview at the moment, and that could be why. I found that going with the TPUs tended to work much better, so I've gone with europe-west4 in the Netherlands for this. Don't worry about the advanced options; that's basically just if you want to set a service account yourself rather than having Google set it. All right, so we press Continue, and that brings us to the tuning dataset and uploading it. We have two options here: we can either upload
the dataset ourselves, or we can point to a dataset that's already in a Google Cloud Storage bucket. In this case, I'm going to upload it from my local drive. Here I've got my reduced output file, and we're going to upload that. At the same time, I also pick which Google Cloud Storage bucket I want that dataset to go into; in this case I'm going to put it in the same bucket as before. You can see I already have one dataset in there, which is a natural-language-to-SQL dataset for fine-tuning a model to do that. I'm just going to put this one in there now.
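As a side note (this isn't shown in the video), if you'd rather push the JSONL file into the bucket with code instead of the console UI, the google-cloud-storage client does it in a few lines; the project, bucket and file names here are placeholders:

```python
from google.cloud import storage

# Placeholders -- swap in your own project, bucket and file names.
client = storage.Client(project="your-project-id")
bucket = client.bucket("your-tuning-bucket-003")
blob = bucket.blob("datasets/tuning_data.jsonl")
blob.upload_from_filename("tuning_data.jsonl")
print(f"Uploaded to gs://{bucket.name}/{blob.name}")
```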
Now, this dataset has to be in a very particular format. So how about we have a look at making the dataset now and see what it's actually going to look like and how you actually make it? Okay, let's jump in and have a look at how you would actually make a dataset for something like this. I'm going to show you two notebooks here; in the first one, I'm going to use the Hugging Face datasets library. First off, let's look at how PaLM 2 wants the data to be. We can see that we've got lines of dictionaries with an input text and then an output text. So it's actually very simple when you really look at it: it basically just wants an input_text and an output_text for each example.
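Here's roughly what a couple of lines of that JSON Lines file look like; the actual values are made up for illustration, and the only thing that matters is the input_text and output_text keys on each line:

```json
{"input_text": "Classify the sentiment of this review: I loved the film!", "output_text": "positive"}
{"input_text": "Classify the sentiment of this review: The plot was a mess.", "output_text": "negative"}
```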
The dataset that I've chosen to go with is the LIMA dataset. There are only a thousand examples in here, so it's a reasonably small dataset, but it actually gets very good performance when people fine-tune models on it. We can basically just load this in using the Hugging Face datasets library. I can then just bring in
the train split in here. We can have a look at
some of the examples. In this case, these examples have actually already been formatted, so we've got the prompt as one variable and the response as another. Now, if you only have plain text, you can actually inject it into different prompt templates; I think this one is set up to be more like a Vicuna-style prompt-and-response format. To convert this, we just want to change each example from being a prompt and a response to being an input_text and an output_text, and then we're going to make a JSON Lines file and write them into it. Sure enough, after we've done that, we can check it and see that we've got a thousand examples. If we look at some of those examples, we can see they all start off with the same sort of prompt, then they've got their actual question, and then each of them has an output with what the response should be. If we want to sample a hundred, or just take the first hundred, there's some code for doing that too, so that I can make a reduced version for testing things out, et cetera.
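A minimal sketch of that conversion (plus the reduced version) could look like this. The dataset id is a placeholder, and I'm assuming the formatted copy has "prompt" and "response" columns as shown in the video, so adjust the names to whatever your copy uses:

```python
import json
from datasets import load_dataset

# Placeholder dataset id -- use whichever formatted copy of LIMA you have,
# with "prompt" and "response" columns as in the video.
dataset = load_dataset("some-user/lima-formatted", split="train")

with open("lima_palm_tuning.jsonl", "w") as f:
    for example in dataset:
        record = {
            "input_text": example["prompt"],     # what the model will see
            "output_text": example["response"],  # what it should produce
        }
        f.write(json.dumps(record) + "\n")

# A reduced version (first 100 examples) for quick testing.
with open("lima_palm_tuning_small.jsonl", "w") as f:
    for example in dataset.select(range(100)):
        f.write(json.dumps({"input_text": example["prompt"],
                            "output_text": example["response"]}) + "\n")
```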
This is a second notebook showing you another example of making a dataset. So, okay, in here I've
basically got a dataset that is replicating a little bit of a DSL, a domain-specific language, where we've got a set of emojis and then a story about each set of emojis. This is not hugely long; I think there are about a hundred examples in there. But again, we've got the two things we need to convert to input and output. In this case, though, what I've done is make them a CSV file first, since a lot of the time people will already have a CSV file that they're trying to convert into training data. If we look at the output of this, we can see we've basically got these dictionaries again, with an emoji and a description in this case, and we're going to convert them to the input and output. So we pass in an example, get the input_text and output_text back, and then just write them out to a JSON Lines file.
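The CSV-to-JSONL step is only a few lines; here's a rough sketch, with the column names ("emoji" and "description") and the file names assumed for illustration:

```python
import json
import pandas as pd

df = pd.read_csv("emoji_stories.csv")  # assumed file and column names

with open("emoji_stories_tuning.jsonl", "w") as f:
    for _, row in df.iterrows():
        record = {
            "input_text": row["emoji"],         # the emoji sequence as the prompt
            "output_text": row["description"],  # the story we want back
        }
        f.write(json.dumps(record) + "\n")
```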
Once you've got that JSON Lines file, that's the key thing you're going to upload to PaLM 2 so it can do its fine-tuning. All right. So as you saw, that's
how I made the dataset, and now I've basically just got it and uploaded it here. I've put it into the bucket, and I can just press Continue. If I want to do model evaluations, say if I've split the dataset and want to do some kind of evaluation, I can set that up here, but I actually don't need to; it's totally optional. Once you've set these four things, we can press Start Tuning, and what this will actually do is create a training pipeline in Vertex AI and bring us across to where we can see the different training runs we've got.
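As an aside, and not something shown in the video: if you'd rather kick the tuning job off from code instead of the console, the Vertex AI SDK exposes the same supervised tuning. Roughly like this, where the project, bucket path and step count are placeholders, and on older SDK versions these classes live under vertexai.preview.language_models instead:

```python
import vertexai
from vertexai.language_models import TextGenerationModel

vertexai.init(project="your-project-id", location="us-central1")

base_model = TextGenerationModel.from_pretrained("text-bison@001")
base_model.tune_model(
    training_data="gs://your-tuning-bucket-003/tuning_data.jsonl",
    train_steps=48,                      # from the steps calculation earlier
    tuning_job_location="europe-west4",  # the TPU region used in the video
    tuned_model_location="us-central1",
)
```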
So here you can see the emoticon stories model where I was training that before, a natural-language-to-SQL model that I trained up, and a number of other models. Depending on the region, you'll see your different pipelines. At the start, the job will just sit in Pending for quite a while; eventually you'll see that it's Executing. Sometimes it will fail: obviously, if you've made a mistake in the dataset you'll get a failure, but I also find that occasionally it will just fail on its own when something goes wrong. If that happens, just go through the steps again and retry. Let's have a look at what's
actually in one of these pipelines. So you can come into the pipelines and we
can see, okay, what's actually in here. So we've got some stats about the
training and the various things in there. If we look in here, we can actually see what they're doing in the background. We can see that it creates a tuning graph and needs to pre-process the dataset: it converts the JSONL to TFRecords. (TFRecords, if you're not used to them, are just a binary format that TensorFlow often uses for training; converting to them makes the training faster and easier.) We can also see the model and the various things related to it: we can see here that we've been training with a TPU, and the number of TPUs, and looking down here we can see we've actually got a 6,000-plus token input limit and a roughly 2,000-token output limit for these things too, which is interesting to see. Going through this, we can see it makes everything, then it deploys the graph and creates an endpoint
that we can actually use. So once you've gone through and
you've actually trained up your model, you can see that the model job we just kicked off is now training; it's starting to kick off here. Once your model has actually finished training, you'll see Test come up here. If we click Test, we go to the standard Generative AI Studio playground where we can try out a model. You can see here that we can pick from the different models: here's a LIMA model that I trained, here's the emoticon stories one, here's a natural-language-to-SQL model, and a bunch of other models. And they sit alongside the latest text-bison models, including the latest text-bison 32k, and you could imagine that in the future you'll see Gemini models, et cetera, in there. Just like your normal models, you can set the temperature with your fine-tuned model, set the token limit, and set a variety of other things here, and then you can play with the actual model. Now, if you want to actually export
the code, you can just come up here and select View Code. Let's just put in something. Okay, so I've put in a prompt here; I've actually put the prompt in the wrong format for the way I trained this model. The way I trained it, you're supposed to pass in the name of the table and what's actually in it, but anyway, this will work for now. You can see in here that, because I was going for natural language to SQL with this particular model, I actually used the code-bison model for the fine-tuning; for the one we've got training now, you would see text-bison as the from_pretrained model instead. You can think of this as loading the base model and then applying an adapter, some kind of LoRA-style fine-tuning, on top of it. Now, you can just copy the Python and put it into a Colab, or you can get just the
curl for calling this. At the time of recording, though, I found that if you just use this code as-is, it actually won't work. The code is basically the same as what they give you, but you'll need to change it a little. Firstly, you will need to install the libraries, and after installing them you also need to restart your runtime so they load properly. You then just need to set up your project ID and the region, and in this case, since it's Colab, I'm authenticating into Google Cloud there. I've also got a small utility for text wrapping. Then this is essentially the code they give you; the only difference is that I've commented out the candidate count parameter, because at the moment it gives an error if you leave it in.
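Here's roughly my adjusted version rather than the exact exported snippet; the project ID, region and tuned-model resource name are placeholders, and on older SDK versions the import is vertexai.preview.language_models:

```python
# First: !pip install google-cloud-aiplatform --upgrade
# ...then restart the Colab runtime so the new version loads.
import textwrap

import vertexai
from vertexai.language_models import TextGenerationModel
from google.colab import auth

auth.authenticate_user()  # authenticate this Colab session against Google Cloud

PROJECT_ID = "your-project-id"  # placeholder
LOCATION = "us-central1"        # region where your tuned model endpoint lives -- adjust to yours
TUNED_MODEL = "projects/your-project-id/locations/us-central1/models/1234567890"  # placeholder

vertexai.init(project=PROJECT_ID, location=LOCATION)

parameters = {
    # "candidate_count": 1,  # commented out -- this was throwing an error at recording time
    "max_output_tokens": 1024,
    "temperature": 0.7,
    "top_p": 0.8,
    "top_k": 40,
}

model = TextGenerationModel.get_tuned_model(TUNED_MODEL)

def ask(prompt: str) -> str:
    """Ping the tuned model and wrap the response so it's easy to read."""
    response = model.predict(prompt, **parameters)
    return textwrap.fill(response.text, width=80)

# e.g. for the emoticon-stories model, pass in a string of emojis:
print(ask("👦👧🏠🐶"))
```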
So you can see we've got the model here. The first model that I want to try out predicts a story from some emoticons, and sure enough, I've set up a nice little function, roughly like the helper in the sketch above, to ping the model, get a response back, and wrap the response so we can read it. So we pass in a list of emoticons
like this, and sure enough, it makes a story about it: a young boy and girl living in a house with a dog, right through to "and they lived happily ever after". If we try out some other things, like a male artist painting a female, or someone working on a robot, there's a whole bunch of different, quite funny results in here. And this is with only a very small amount of training, so it does show that the model has learnt more than just what the emoticons are; it has started to work out how to put stories together from them. For those of you who know what Singlish
is: I trained up a model to do Singlish, where you can basically pass in questions and such. So this is clearly instruction fine-tuning, but all the answers come back in Singlish, the Singapore English dialect, and it certainly gets the feel of some of the sayings. Lastly, the natural-language-to-SQL model, just to show you this one. This is basically using the code-bison model and we're predicting the SQL. You see, the input here is actually
quite important, because I'm passing in a context and a question. The context here is the table name and also what the columns in the table are, and then we've got the question. Passing that in, it will write a SQL query along the lines of INSERT INTO the table name and then whatever it is that we want to insert. You can see it's quite good at taking the table and the details you give it and writing a SQL query that would actually work against it.
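For illustration only (the table, columns and wording here are invented, not taken from the actual training set), one training record for this natural-language-to-SQL setup would be shaped roughly like this:

```python
example = {
    "input_text": (
        "Table: orders (order_id, customer_name, item, quantity)\n"
        "Question: Add an order of 3 notebooks for Alice."
    ),
    # The SQL we want the tuned code model to produce:
    "output_text": "INSERT INTO orders (customer_name, item, quantity) "
                   "VALUES ('Alice', 'notebooks', 3);",
}
```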
And this is trained on a very small number of examples: literally a couple of thousand examples for this one, and the other models are just a hundred or a few hundred examples each. So hopefully this gives you a guide of how to do fine-tuning with the PaLM 2 models on Vertex AI on Google Cloud Platform. As always, if you've got any questions, please put them in the comments below. If you found the video useful, please click like and subscribe. I will talk to you in the next video. Bye for now.