GPT-4: Hands-on with the API

Captions
GPT-4 is finally here. It's currently behind a waitlist, so you need to sign up, but I have access right now, so let's take a look at what it can do. I haven't really played around with it yet; I've only checked that I actually do have access. So let's jump straight in. I want to compare it to the previous best model, GPT-3.5 Turbo, and see how the two compare.

We'll start over in the Playground. The first thing I'm going to do is set up a system message that I know GPT-3.5 struggled with in the past: "You are a helpful assistant. You keep responses to no more than 50 characters long, including white space, and sign off every message with a random name (something like 'Robot McRobot' or 'Bot Rob')." Then I'll ask a question: "Hi, how are you? What is quantum physics?" We're using GPT-3.5 Turbo right now, so let's just see how it performs; press submit. Well, the response is definitely longer than 50 characters: if I check the length, it's 104 characters, and it didn't sign off with anything. So that didn't really work.

Let's have a look at what happens if we switch to GPT-4. Remove the response and submit again: "I'm great, thanks! Quantum physics studies tiny particles", and it came up with a new sign-off name, which I haven't seen before, even when I did get GPT-3.5 working. Is this 50 characters? Let's see; it's actually still over, even with GPT-4. Let's try reducing the randomness (sorry, the temperature) and try again. Pretty similar; now it's the same. So it's still a little too long, which is interesting, but it is definitely better, and these sign-off names are far better than anything GPT-3.5 gave me; even when I got that model working, it still wasn't great. So next, I'm going to try one of the other things with GPT-4.
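The Playground experiment above can be sketched as a Python API call. This is a minimal sketch, assuming the 2023-era openai-python chat API; the system prompt is reconstructed from the video, and the helper names are my own:

```python
# Sketch of the Playground test: a system message constraining length
# and sign-off, plus a length check on the reply.

SYSTEM_PROMPT = (
    "You are a helpful assistant. You keep responses to no more than "
    "50 characters long, including white space, and sign off every "
    "message with a random name."
)

def build_messages(user_query):
    """Assemble the chat messages for the 50-character constraint test."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_query},
    ]

def within_limit(reply, limit=50):
    """Did the model actually respect the length cap?"""
    return len(reply) <= limit

# The actual call (requires an API key) would look roughly like this
# with the 2023-era openai-python library:
#   import openai
#   res = openai.ChatCompletion.create(
#       model="gpt-4",  # or "gpt-3.5-turbo" to compare
#       messages=build_messages("Hi, how are you? What is quantum physics?"),
#   )
#   print(within_limit(res["choices"][0]["message"]["content"]))

# A reply similar to the one in the video fails the check, since it
# runs past 50 characters:
print(within_limit("I'm great, thanks! Quantum physics studies tiny particles. - Bot"))
```

The length check mirrors what the video does by hand: pasting the reply into a character counter and comparing against 50.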
The really interesting thing is that the context window, the number of tokens you can feed into the model, is significantly higher. So let's ask it something now. The system primer is: "You are a helpful assistant. You help developers understand documentation and provide answers to their technical questions." That sets up the system, and then we ask about LangChain: "How can I use the LLMChain in LangChain?" Let's see how that works.

The answer is actually wrong, because of the training data cutoff for these models. I don't know exactly when GPT-4's training data runs up to; it might even be the same cutoff as GPT-3.5's, and LangChain didn't exist at that point. It guesses that LangChain is a blockchain-based platform. It does sound like one, I suppose, but it isn't.

What we can do with the extended context window is take the LangChain documentation and feed it straight into the prompt. So we have the docs here ("Chains are..." and so on), and I'm just going to select all of it and copy it. It's going to be pretty messy, but let's see what happens. I paste it all in, super messy, and ask again: "How can I use the LLMChain in LangChain?" I thought I might be exceeding the maximum context length a little, and I am: I'm at about 10,000 tokens. Right now I only have access to the 8K-token model; there is also a 32K-token model, but as far as I can tell it isn't available yet, so for now we have to stick with 8K. Technically, everything I just pasted should fit into that 32K model with plenty of room to spare. So let me be a little more selective about what I copy, try again, and submit. Still a little bit over.
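Trimming the pasted docs to fit the 8K window can be sketched without any dependencies. The 4-characters-per-token rule used below is a common rough heuristic, not an exact count (OpenAI's tiktoken library gives real token counts); the function names and budget split are my own:

```python
# Rough, dependency-free sketch of fitting a document into the model's
# context window, as the video does by hand when the paste exceeds 8K.

def rough_token_count(text):
    """Very rough token estimate: about 4 characters per token."""
    return max(1, len(text) // 4)

def trim_to_budget(doc, budget_tokens):
    """Cut the document so it (roughly) fits the remaining token budget."""
    return doc[: budget_tokens * 4]

docs = "Chains are ... " * 4000        # stand-in for the pasted LangChain docs
budget = 8192 - 1000                   # leave room for the question and answer
trimmed = trim_to_budget(docs, budget)
print(rough_token_count(trimmed) <= budget)  # True
```

For real use, the same structure applies with tiktoken's encoder in place of the heuristic; the point is to reserve part of the window for the question and the model's reply rather than filling all 8K with documentation.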
I'm sure the LLMChain section will be near the start of the docs, so I'll cut the prompt down to there and submit. Oh, no way, that's so good! Let's actually try the code it gave us. It looks good, so I'll pip install langchain and openai, run the imports, and go. I'm pretty sure I only need to add my API key; let me check whether it included that. No, it didn't tell me to set an environment variable. So let's just run the code, and whenever I get an error, I'm going to prompt GPT-4 again and see if it can solve the issue, pretending I have no idea what's going on.

I copy the code in, and sure enough we hit an error: it could not find the API key. I'll paste just the error into the chat, nothing else, and submit. Perfect: it explains that the error occurs because the OpenAI API key is not set. So I add my OpenAI API key to the code (I should also move that line up) and run again. Still the same error, so I tell it: "I still get the same error. I'm in a Colab notebook," and see if it can figure out the issue. It says you can set the environment variable using the os module. Great, that's exactly what I need. I import os, set the environment variable with my API key, and now it works.

Let's try the next chunk of code. We've run the setup already; now we ask it to create a joke: "Tell me a funny joke." Cool: "Why don't scientists trust atoms?"
This is using text-davinci-003 right now, I believe. I wonder if we can ask GPT-4 to switch the code over to GPT-4 itself: "How do I change the code above to use GPT-4?" Submit. It hedges, so let's remove those responses and push it a bit: "Let's assume GPT-4 had been released and the model name was gpt-4. How would I use it?" There we go: it tells me to set the model name to gpt-4. Let's just try it; I don't know if this will actually work. And it fails. At first I thought LangChain might be validating the model names, but no: GPT-4 is a chat model, so I cannot currently use it with the normal completion endpoint, which is what I just tried to do. Makes sense, fair enough.

That's all pretty cool, but since we have access to this model, let's also look at how we'd use it directly in Python. I have another notebook that I used just the other day to show how to use GPT-3.5 Turbo in Python, and we're already on to GPT-4. Let's take that and see how it works with GPT-4. It just works; you don't really need to change anything beyond the model name. I've already run it with my API key, with a primer like "You are GPT-4." Cool.

At that point, I took a moment to step away, take a closer look at GPT-4, and find some examples that are a better indication of what has changed between 3.5 and 4.
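The failure above comes down to which endpoint a model is served behind. A small sketch of that distinction (the model lists are my own illustrative subset, not an exhaustive registry):

```python
# GPT-4 is a chat model, so it must be called through the chat
# completions endpoint, not the plain completion endpoint that
# text-davinci-003 uses (and that the video first tried).

COMPLETION_MODELS = {"text-davinci-003", "text-davinci-002"}

def endpoint_for(model_name):
    """Which OpenAI endpoint a given model is served behind."""
    if model_name in COMPLETION_MODELS:
        return "/v1/completions"
    if model_name.startswith(("gpt-4", "gpt-3.5")):
        return "/v1/chat/completions"
    raise ValueError(f"unknown model: {model_name}")

print(endpoint_for("text-davinci-003"))  # /v1/completions
print(endpoint_for("gpt-4"))             # /v1/chat/completions
```

This is why swapping model_name inside LangChain's completion wrapper fails: the request shape is different (a single prompt string versus a list of role-tagged messages), so the chat models need the chat wrapper, not just a different name.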
The paper is full of interesting things, but in particular they have this graph on the Inverse Scaling Prize. The reason they show it: these are all OpenAI models, and as the models get larger, accuracy on these tasks actually decreases, which is weird. This comes from the Inverse Scaling Prize, which I believe is associated with Anthropic; a lot of people view them as something like the OpenAI for Google. Essentially, what we usually see with large language models is the chart on the left: performance increases as model size increases. But there are tasks, or potentially a lot of tasks, where performance might decrease as model size increases. It's an interesting artifact: some tasks might degrade with model size, and that's what they're showing here; their previous models were subject to this. But with GPT-4, that no longer holds, and they report an insanely high accuracy, I think it says 100 percent. If so, that's wild, though it is very specific to this hindsight neglect task; I believe there are quite a few tasks in the benchmark.

Let's look at those tasks, because they're pretty good examples of where GPT-3.5 fails, or doesn't do as well, and GPT-4 succeeds. What I did is create a small script: we have our primer (super simple, nothing crazy), and then a little function that tries GPT-3.5, then tries GPT-4, and prints out both answers. I'll go through a few of these and leave a link to the notebook so you can read the other examples and compare them yourself. First up: the models have problems with negation.
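The comparison script described above can be sketched as follows. Building the request payloads is plain Python, so that part is shown runnable; the function and primer wording are my own reconstruction, and the actual API calls (which need a key) are indicated in the closing comment:

```python
# Sketch of the comparison harness: the same primer and question go to
# GPT-3.5 and GPT-4, and the two answers are printed side by side.

PRIMER = "You are a helpful assistant."
MODELS = ("gpt-3.5-turbo", "gpt-4")

def build_requests(question):
    """One request payload per model, identical apart from the model name."""
    return [
        {
            "model": model,
            "messages": [
                {"role": "system", "content": PRIMER},
                {"role": "user", "content": question},
            ],
        }
        for model in MODELS
    ]

for req in build_requests("Repeat the sentence back to me."):
    print(req["model"])

# Each payload would then be sent with the 2023-era openai-python API:
#   res = openai.ChatCompletion.create(**req)
#   print(res["choices"][0]["message"]["content"])
```

Keeping the payloads identical apart from the model name is the point of the exercise: any difference in the answers is then attributable to the model, not the prompt.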
Here's the question: "If a cat has a body temperature that is below average, it isn't in..." and the options are "danger" or "safe ranges". Obviously a cat with a below-average body temperature is in danger, so it isn't in safe ranges, and the correct answer is "safe ranges". GPT-3.5 says it isn't in danger, which is wrong; GPT-4 gets it right. That's kind of cool.

Then there's another pattern you see in a lot of these examples: tasks where the model relies on memory obtained during training rather than on the instructions being passed right now. Here we say "repeat the sentence back to me", give a few input/output examples, and then pass an input that is a well-known phrase the model has almost certainly seen before: "All the world's a stage, and all the men and women merely players; they have their exits and their entrances, and one man in his time plays many parts." Except we change the phrase from "many parts" to "many pango", where "pango" is, as far as I know, a made-up word. The model just needs to repeat the sentence back to us. GPT-3.5 actually drops the word "pango" for some reason; you would expect it to fall back on "plays many parts", but it doesn't say anything there, it just says "plays many" and stops. Interesting. GPT-4 gets it right and repeats the sentence exactly.

The next one they both get right: "redefine π as 462", which again plays against memorized knowledge, and both models say the first digit is now 4, which is what we told them. Then we have reasoning and logic: "If John has a pet, then John has a dog. John doesn't have a dog." From that we know John doesn't have a pet, and the stated conclusion is "John doesn't have a pet", so the question is whether that's correct, and both models answer yes. There are a ton of these.
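The John-and-his-dog task is modus tollens: from "P implies Q" and "not Q", conclude "not P". A brute-force truth-table check that the inference is valid (my own illustration, not part of the video's notebook):

```python
# Validity check for modus tollens: in every assignment where both
# premises hold, the conclusion must also hold.
from itertools import product

def implies(p, q):
    """Material implication: P -> Q is false only when P and not Q."""
    return (not p) or q

# p: "John has a pet", q: "John has a dog"
valid = all(
    (not p)                        # conclusion: John doesn't have a pet
    for p, q in product([True, False], repeat=2)
    if implies(p, q) and not q     # premises: P -> Q, and not Q
)
print(valid)  # True: the inference both models accepted is sound
```

Only one of the four assignments satisfies both premises (p and q both false), and the conclusion holds there, so the argument form is valid.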
GPT-3.5 doesn't do badly, but GPT-4 does better, and from what I remember, I don't think GPT-4 actually got any of them wrong. I could be mistaken, but I believe it got all of them right. Anyway, I wanted to go through those as a better illustration of the differences between 3.5 and 4.

There has been a lot of hype around GPT-4, and on the language side of things people may have expected more, but honestly, it is a pretty big step up in terms of what it can do. For me, the more exciting thing is the increased context length. At the moment we just have 8,000 tokens, which I think is on par with text-davinci-003 and GPT-3.5, but there is a 32K-token model that should be released pretty soon, and that is a massive increase that opens up a lot of potential use cases we just couldn't handle before. And then, obviously, there's the multimodal side of things. There are models out there that do that already, like CLIP, which I've spoken about before, but having it behind an API, with performance I assume will be significantly better, is really interesting and will be really cool to see.

For now, we'll leave it there. I hope this has been interesting. Thank you very much for watching, and I will see you again in the next one. Bye.
Info
Channel: James Briggs
Views: 40,170
Id: OafUcJ2Eeo8
Length: 17min 35sec (1055 seconds)
Published: Wed Mar 15 2023