Prompt Engineering with OpenAI's GPT-3 and other LLMs

Video Statistics and Information

Captions
Prompt engineering is an emerging discipline within the world of generative AI, and it describes the art of writing good, intentional prompts that produce the output we actually want from a generative AI model. To a degree it is an art, and it is very hard to explain exactly what makes a good prompt, but to a larger extent there is a logical process for creating prompts that can be described and applied to produce better output from large language models, and from the generative art tools as well. Good prompts are the key to producing good outputs from these models. Using different types of prompts we can modify the mode or type of task being performed, and we can even use prompts to "train" models to some degree, and the performance of doing that is actually surprisingly good.

There are a few things to learn with prompt engineering, and one of the best ways to think about this discipline is as a more abstract version of programming. Throughout the last decades we have seen programming languages become more and more abstract; prompts for AI models are almost the next step, a super abstract way of programming an AI model. That is exactly how I want to approach this: I want to discuss prompts, building good prompts, the different parts of a prompt, and how we apply them to large language models.

When we think of large language models, there are a lot of different use cases: creative writing, question answering, text summarization, data extraction, and a ton of completely different things. With each of these tasks we are not actually doing anything different in terms of the model; the model is the same for every one of them. The difference is the prompt itself.

We can typically break a prompt apart into a few components: the instructions, any external information (quite commonly also called context), the user input or query, and finally we can prime our prompt with what we call an output indicator, which is usually just a short piece of text at the end. Not all prompts require all of these components, but a good prompt will often use one or more of them.

Starting with instructions: instructions tell the model what to do, and they are a key part of instruction-based models like OpenAI's text-davinci-003. Through the instructions we define what we would like the model to do, which means how it should use its inputs, how it should format its outputs, and what it should consider while going through that process. We always put these instructions at the very top of the prompt; I'll explain more about that soon.

Following this we have our external information, or context. These are additional pieces of information that we feed into the model via the prompt. They can be things we manually insert into the prompt, information we pull in through a long-term memory component such as a vector database, or information we get through other means like a web search API or a calculator API.

After that we have the user input. That one is pretty obvious: it is just the input from a particular user, and it depends on what you are building. If you have a text summarization use case, the user might input a two-page chunk of text that we want summarized into a paragraph.
On the other hand, maybe it is a question answering tool, in which case the user might just type in a few words and a question mark, and that question is the user input. So the user input can vary as well.

Finally we have the output indicator. This is essentially the start of what we would like the model to begin generating; it is a way of indicating to the model "okay, now it is time for you to start writing, and I want you to continue from this first little chunk of text." A very clear example, at least in my view, is a code generation model: if you want to generate Python code, you give the model instructions to do so, and your output indicator is just the word "import", all in lowercase, because most Python scripts begin with import statements (import numpy and so on). On the other hand, if you were building a conversational chatbot, the output indicator might be the name of the chatbot followed by a colon, as if you were partway through a chat log. Those are the four main components of a prompt that we are going to use to construct our prompts throughout this video.

Let's look at an example. Right at the top of the prompt we have the instruction: "Answer the question based on the context below. If the question cannot be answered using the information provided, answer with 'I don't know'." This is a form of conservative Q&A: given a user question, we want the model to answer based only on information that we can verify. That verified information is our external information, the context, which is also fed into the prompt, usually as a list of contexts, and if the answer is not contained within that context the model should say "I don't know". If the model answers anyway and makes something up, that can lead to pretty bad results, and these models do tend to make things up fairly often, so we really do not want that. So we have our instructions, our external information or context, then the user query, which here is "Which libraries and model providers offer LLMs?", and then that final word, the output indicator, which effectively says "okay, now you can start answering the question." This is a pretty good prompt: clear instructions, some external information, a question, and the output indicator at the end.

Now let's look at how to actually implement this. We are going to work through a notebook; if you would like to follow along and run it yourself, there is a link at the top of the video and in the video description. The first thing we need to do is pip install the openai library; initially that is the only library we need, and there will be another one a little later on, which I'll explain when we get to it. In the first code block we have the prompt, the same one I just went through, so I run that and then initialize my OpenAI instance using my OpenAI API key. If you need one, you can get it from beta.openai.com/account/api-keys: log into your account, create a new secret key, and copy it into the notebook.
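To make that setup concrete, here is a minimal sketch in Python, assuming the pre-1.0 openai library used in the video; the exact wording of the context string is illustrative rather than copied from the notebook:

```python
import openai

# Authenticate with the secret key from beta.openai.com/account/api-keys
openai.api_key = "YOUR_API_KEY"  # replace with your own key

# The four prompt components: instructions, external information (context),
# the user's query, and an output indicator ("Answer: ").
prompt = """Answer the question based on the context below. If the
question cannot be answered using the information provided answer
with "I don't know".

Context: Large Language Models (LLMs) are the latest models used in NLP.
Libraries and model providers such as Hugging Face, OpenAI, and Cohere
offer LLMs via open source libraries or hosted APIs.

Question: Which libraries and model providers offer LLMs?

Answer: """
```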
Once you have authenticated, we can generate from the prompt we just created. All we do is call the Completion endpoint's create method using the text-davinci-003 model, one of the most recent instruction models, and then print out the response, which sits at a particular path inside the returned JSON.

You will see that the output stops pretty suddenly. The reason is that our maximum completion length is not very long; I'll explain that in more detail later on, but for now we just need to increase it, so we set max_tokens to 256. With that, the model answers the question "Which libraries and model providers offer LLMs?" exactly right: Hugging Face, OpenAI, and Cohere. Alternatively, if the correct information is not within the context, the model should reply "I don't know", because of that extra instruction. So if I replace the context with something like "libraries are a place full of books" and run it again (copying in the max_tokens setting), we see that it follows the instruction and says "I don't know". Great, so that is a simple prompt.

Next, let's talk about the temperature parameter of the completion endpoint. We can think of temperature as telling us how random, or how creative, the model is allowed to be: it controls the probability of the model choosing a token that is not actually its first choice. This works because when the model predicts the next token, it assigns a probability distribution over all the possible tokens it could output, and there are tens of thousands of those, not just a handful. The model effectively assigns each candidate token a probability: maybe one is high, another is low, another is fairly big, and one is the most likely of all. With the temperature set to zero there is no randomness in the model, and it will always choose that highest-probability token. If instead we turn the temperature up to one, there is a lot more randomness: the model may still choose the top token, because it has the highest probability, but it will also consider the next most likely token, and to a lesser extent the one after that, and so on. By increasing the temperature we increase the weighting of those other possible tokens during generation, which generally leads to more creative, more random outputs. So for our conservative, fact-based Q&A we would want to turn the temperature down towards zero, because we do not want the model to make anything up; we want it to be factual rather than creative.
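As a sketch of what that call looks like with the pre-1.0 openai library (the path into the returned JSON is the standard one for the Completion endpoint):

```python
# Generate a completion from the prompt with text-davinci-003.
response = openai.Completion.create(
    model="text-davinci-003",
    prompt=prompt,
    max_tokens=256,   # raise this so the answer is not cut off mid-sentence
    temperature=0.0,  # low temperature for conservative, factual Q&A
)

# The generated text sits at this path inside the returned JSON.
print(response["choices"][0]["text"].strip())
```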
Whereas if the idea is to produce some creative writing or some interesting chatbot conversations, then we might turn the temperature up, because it will usually produce something more entertaining, more interesting and, to some degree, surprising in a good way.

Let's take a look at what that might look like. Here we are going to create a conversation with an amusing chatbot. Again I need to add max_tokens, and we will start with the temperature very low: the default with the OpenAI completion endpoint is 1, and here we are running it at 0. The prompt is "The following is a conversation with a friendly chatbot. The chatbot's responses are amusing and entertaining." Those are the instructions, and below them we have the start of the conversation, the user's input, and then our output indicator. Running this we get: "Oh, just hanging out and having a good time, what about you?" It is pretty predictable, not that interesting, and definitely not funny or amusing.

Now let's run it again with the temperature turned up. Last time I got a good answer to this, but it does not always produce one, so we can try (putting the max_tokens in again). This time it is a little more interesting: "Hanging out with my electronic friends, it's always a good time" and "contemplating the meaning of life". A few better answers; I do not think any of them are as good as the first one I got, but they are not bad, and definitely much better than the earlier answer, which was just a bit plain and boring.

Let's move on to what we would call few-shot training for our model. What we will often find is that these models sometimes do not quite get what we are looking for, and we can see that in this example. The prompt is a similar thing again: "The following is a conversation with an AI assistant. The assistant is sarcastic and witty, producing creative and funny responses to the users' questions. Here are some examples:" and in there we can put in some examples. Before we do that, though, I want to remove them and show you what we get to begin with. We have turned the temperature up, so it should come up with something creative to a degree, but the result is not particularly interesting: maybe if you wanted a serious answer this is what you are looking for, but I was not asking for anything serious. I wanted something sarcastic, witty, creative and amusing.

So what if we add a few examples to our prompt? This is what we would refer to as few-shot training: we are adding a few training examples directly into the prompt. The user asks "How are you?" and the AI, being sarcastic, replies "I can't complain but sometimes I still do." The user asks "What time is it?" and the AI says "It's time to get a watch." Now let's see if we get a less serious answer to "What is the meaning of life?" (putting the max_tokens in again). The previous answer I got was pretty good and I do not know if we will get one like that again, but let's try. And we do get something good: "As the great philosopher Shrek once said to Fiona, the meaning of life is to find your passion." Kind of useful, but also pretty amusing. This is a much better response, and we got it just by providing a few examples beforehand: we did some few-shot training, showed a few training examples to the model, and all of a sudden it produces much better output.
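Here is a minimal sketch of that few-shot prompt as Python code; the example exchanges follow the video, while the exact formatting and the temperature value are assumptions:

```python
# Few-shot "training": example exchanges are placed directly in the prompt.
few_shot_prompt = """The following is a conversation with an AI assistant.
The assistant is sarcastic and witty, producing creative and funny responses
to the users' questions. Here are some examples:

User: How are you?
AI: I can't complain but sometimes I still do.

User: What time is it?
AI: It's time to get a watch.

User: What is the meaning of life?
AI: """

response = openai.Completion.create(
    model="text-davinci-003",
    prompt=few_shot_prompt,
    max_tokens=256,
    temperature=1.0,  # higher temperature for more surprising, creative answers
)
print(response["choices"][0]["text"].strip())
```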
The next thing I want to talk about is adding multiple contexts. In the first example we had a context in there, but we wrote it manually; in reality we do something slightly different. Consider the question answering use case: we want the model to be factual, we do not want it to make things up, and ideally we would also like the model to be able to source where it is getting its information from. What we essentially want is some external source of information that we can feed into the model, and that we can also use to check that what the model is saying is actually true, or at least comes from somewhere reliable.

When we feed this type of information into a model via the prompt, we refer to it as source knowledge. Source knowledge is simply any knowledge that is fed into the model via the prompt. It is an alternative (I do not want to say the opposite) to what we call parametric knowledge, which is knowledge the model learned during the training process and which is stored within the model weights themselves. So if you ask the model who was the first man on the moon, it will probably be able to say Neil Armstrong, because it has remembered that from its training on enormous amounts of human text. But if you ask more specific, pointed questions, it can sometimes make things up, or provide an answer that is generic and not actually that useful. That is where we would like to use source knowledge to feed in more useful information.

In this example we are just going to feed in a list of dummy external information. In reality we would probably use something like a search engine API or a long-term memory component rather than relying on a hard-coded list, but for the sake of simplicity this is all we are going to do. We have a few contexts: they talk about large language models being the latest models used in NLP, about getting your API key from OpenAI, about OpenAI's API being accessible via the openai library along with some example code, and about using it via the LangChain library. We are going to use all of this information to build a better prompt and create a better output.

As before, we have our instructions at the top, then our external information, our contexts. For GPT-3 in particular, the recommendation is to separate your external information from the rest of the prompt using a run of three separator characters (there are a couple of options here, and we will stick with one of them), and to also separate each individual context from the next with a shorter run of the same characters. Then we have our question, and then our output indicator. Let me copy the prompt and display it up here so we can see what we are actually building: I need to run the contexts cell first, and I need to use print, otherwise it is a mess. It is still a little messy, but it works, and it also points out that I have missed something.
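For reference, here is a minimal sketch of what that construction looks like in Python once the formatting described below is in place; the context strings are illustrative stand-ins, and since the exact separator characters are not clear from the captions, the "###" and "---" used here are an assumption:

```python
# Dummy "source knowledge": in a real system these might come from a search
# API or a vector database rather than a hard-coded list.
contexts = [
    "Large Language Models (LLMs) are the latest models used in NLP.",
    "To use OpenAI's models you first create an API key for your account.",
    "OpenAI's API is accessible via the `openai` Python library.",
    "OpenAI LLMs can also be used through the LangChain library.",
]

# Join the individual contexts with a short separator between each one.
context_str = "\n\n---\n\n".join(contexts)

query = ("Give me two examples of how to use OpenAI's GPT-3 model "
         "using Python, from start to finish.")

prompt = f"""Answer the question based on the context below. If the question
cannot be answered using the information provided, answer with "I don't know".

###

{context_str}

###

Question: {query}

Answer: """

print(prompt)
```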
What I had missed: we have the instructions, then the separator, then our contexts, but between the contexts I only have the separator characters, and that is not exactly what we want; we want some newline characters in there as well. To add those we need to build the context string separately and then insert that string directly into the prompt, and with that we get a much nicer format. By the way, it will work even without the nice formatting, but we should try to format it like this, with each context clearly separated. Then further down we have the question and the answer indicator at the end, and that is what we want. I replace the placeholder with the context string, add in our max_tokens, and run it.

The question (looking back up at the top: "Answer the question based on the context below", answering "I don't know" if it does not know the answer, same as before) is "Give me two examples of how to use OpenAI's GPT-3 model using Python, from start to finish." What we get back is two options: we can either use it via the openai library or go via LangChain, and both of those are correct.

One question here is: we added in all those contexts, but did we actually need them? Can we run this prompt without the context and still get a decent answer? We can try: we answer the question with no context, the same question, the same output indicator (and the one thing we need again is the max_tokens). What we get is "using OpenAI's GPT-3 model with Python to generate text", which is true but not very useful, and then "using GPT-3 to generate images", which is not even possible. So not a good answer, essentially, and this is where we can see that the source knowledge has actually been pretty useful.

Now, considering how big our prompt got with those contexts, and that was not even that much information being fed in, how big is too big? At what point is a prompt too large for us to actually use the model? This is obviously important, because if we go too big we are going to start throwing errors. For text-davinci-003, what we call the context window, which is the maximum number of tokens the model can handle across both the prompt and the completion, is 4097 tokens. Tokens, not words. We can set the maximum completion length of our model, as we saw before, by setting max_tokens, but the prompt tokens plus max_tokens cannot exceed 4097. The only problem is: how do we measure the number of tokens within our prompt? For that we need to use OpenAI's tokenizer, so you will need to pip install tiktoken. Taking the prompt (I will stick with the version without the extra newlines), let's look at how we can measure the number of tokens in it. We import tiktoken, create our prompt, set our encoding name (which I will explain in a moment), and then create our tokenizer by calling tiktoken's get_encoding function with that encoding name. The encoding name is important because it differs from model to model.
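A quick sketch of that token count with tiktoken; text-davinci-003 uses the p50k_base encoding, as discussed below, and the 412 figure naturally depends on the exact prompt text:

```python
# Measure how many tokens of the context window the prompt will consume.
import tiktoken

tokenizer = tiktoken.get_encoding("p50k_base")  # encoding used by text-davinci-003
prompt_tokens = len(tokenizer.encode(prompt))   # `prompt` is the multi-context prompt from above
print(prompt_tokens)  # 412 in the video
```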
We then encode the prompt with the tokenizer and check its length: 412 tokens. So before we even get a response from the text-davinci-003 model, the prompt is going to use 412 of the 4097 tokens in its context window, which leaves us with 3685 tokens for the completion. We can set max_tokens to that number, but no higher.

One important thing to note here is that not all OpenAI models use the p50k_base encoder. There is a link that will take you through to a table of the different encoders, but in short, at the time of recording this video: most GPT-3 models, and also GPT-2, use the r50k_base (or gpt2) encoder; the code models and the recent instruction models use p50k_base; and the embedding model text-embedding-ada-002 uses cl100k_base.

If we now set our max_tokens to 3685 and generate, one thing you will notice straight away is that the completion takes longer; that is because we have allowed more maximum tokens, and even if the model does not fill that entire space, the call takes longer. The output is pretty much the same as what we got before, even though we have increased the number of tokens, because the model does not need all of that space. Now, what happens if we increase max_tokens by one more, to 3686? You can probably already guess: we get an InvalidRequestError, because we have exceeded the maximum context length. So we need to be cautious with this, and if you are using this in an environment where the maximum context length might be exceeded, you should probably consider implementing a check like this in your pipeline.

That is actually everything for this video. We have been through a fair few things when it comes to building better prompts and handling some of the key parameters of your completion requests. As I mentioned, prompts are super important, and if you do not get them right your output is not going to be that great, to be honest. So it is worth spending some time learning more about prompts, not just what we have covered here, and, more than anything, experimenting with your prompts and considering other things depending on what you are doing: do you need to pull in more information from an external source? Do you need to modify the temperature? Are there other variables in the completion endpoint that you could be modifying? All of these things are important to consider if you are actually building something of real value using large language models. One other thing to point out is that none of this is specific to GPT-3 models: if you want to use Cohere's generation endpoints, or open-source Hugging Face models (say you want to use BLOOM or something similar for completion), you should consider these same prompt engineering rules of thumb and tips. Beyond that, I think I have covered everything, so that is it for this video. I hope it has been useful and interesting, but for now we will leave it there. Thank you very much for watching, and I will see you again in the next one. Bye.
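To close out, here is a minimal sketch of the kind of length check suggested above; the safe_max_tokens helper is hypothetical (not from the video), and the 4097 constant and p50k_base encoding reflect text-davinci-003:

```python
import tiktoken
import openai

CONTEXT_WINDOW = 4097  # prompt + completion token budget for text-davinci-003
tokenizer = tiktoken.get_encoding("p50k_base")

def safe_max_tokens(prompt_text: str, desired: int) -> int:
    """Cap max_tokens so prompt plus completion never exceed the context window."""
    used = len(tokenizer.encode(prompt_text))
    remaining = CONTEXT_WINDOW - used
    if remaining <= 0:
        raise ValueError("Prompt alone exceeds the model's context window")
    return min(desired, remaining)

# `prompt` is the multi-context prompt built earlier in the notebook.
response = openai.Completion.create(
    model="text-davinci-003",
    prompt=prompt,
    max_tokens=safe_max_tokens(prompt, desired=3685),
    temperature=0.0,
)
```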
Info
Channel: James Briggs
Views: 11,935
Keywords: python, machine learning, artificial intelligence, natural language processing, nlp, Huggingface, semantic search, similarity search, gpt 3, gpt 4, gpt chat, chat gpt, prompt engineering, prompt engineering tutorial, prompt engineering gpt 3, prompt engineering 101, prompt engineering course, prompt engineering ai, prompt engineering openai, openai, openai tutorial, openai gpt 3 python, llms, large language model, large language models explained, generative ai, openai tiktoken
Id: BP9fi_0XTlw
Length: 29min 22sec (1762 seconds)
Published: Wed Feb 01 2023