New ChatGPT API - Build A Chatbot App

Captions
OpenAI recently released the API for its most advanced language model, gpt-3.5-turbo. This is the model that powers ChatGPT, and it's 10 times cheaper than its predecessor, GPT-3. It can be used both for conversational chat and for the standard GPT-3-style completions we're all familiar with. Let's learn how to build a conversational chatbot app using this new model, along with SvelteKit and Vercel Edge Functions.

We'll start out in the OpenAI playground to get a better look at how this new API actually works before we integrate it into an app. On the left we have a system input, which is where we can prime the model by giving it an identity or some context it might need from the start. For example, we could say: "You're an enthusiastic and witty customer support agent working for Huntabyte. Your name is Axel and you're happy to help." The AI will use that information when responding to the user. So if I enter "Who are you?" into the user prompt and click Submit, it gives us a response we might expect. We can continue the conversation by informing the assistant that we're trying to learn how to code; when we submit, it asks what language we're interested in learning, we answer Python, and it retains the context and renders out a nice list of resources for learning Python, installing it, and practicing.

What's actually happening here, though, is that we're sending the entire chat history with each request. The model reads that history to determine the current context of the chat before it generates a response. If we remove the last few messages and then ask, "What resources did you provide me with?", it tells us that it hasn't provided any resources yet. So the context isn't sitting somewhere on a server; we have to pass the entire history of the chat with each request so the model knows what we were talking about before and what information it has already given us. That's important to keep in mind.

Now let's get into the code. As always, the starting and final code can be found in the description below. We're starting out with the styles and markup for our main app page, as well as a ChatMessage component that changes its appearance depending on whether the message is from the user or from the AI. Nothing too complex is going on here; these are all daisyUI Tailwind components. I've also gone ahead and added my OpenAI API key to the .env file as OPENAI_KEY. Don't worry, this key will be deleted before this video is published, but you can try to use it if you want.

Now let's install the OpenAI Node SDK and set up an endpoint at /api/chat with a +server.ts file inside of it. This is going to be a POST request handler, so we'll change the export to POST, bring in our types, and destructure the request from the request event that gets passed into the handler.

Next, let's look at the OpenAI API reference documentation to understand how we'll create these chat completions. The request body requires a model and a messages array, and each message in that array is an object with a role and some content.
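To make that shape concrete, here's a minimal sketch of the kind of request body the playground sends on each submit. The field names follow the API reference; the assistant's reply content is just illustrative.

```ts
// Illustrative chat completion request body.
// Note: the full conversation history is included on every request.
const requestBody = {
  model: "gpt-3.5-turbo",
  messages: [
    {
      role: "system",
      content:
        "You're an enthusiastic and witty customer support agent working for Huntabyte. Your name is Axel and you're happy to help."
    },
    { role: "user", content: "Who are you?" },
    { role: "assistant", content: "I'm Axel, a customer support agent for Huntabyte!" }, // illustrative
    { role: "user", content: "What resources did you provide me with?" }
  ]
};
```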
The three roles we have access to are assistant, which is the AI; user, which is our user; and system, which is that priming message we set the model up with from the beginning. If we look at the response, it returns a message with a role of assistant — the AI-generated response — along with its content. Something you'll notice is that the API doesn't return all of our messages back with the response; it only returns the answer to the latest question. That means we have to keep track of the messages ourselves, so that on the next request we can send the full history back and the model has the complete context, like we discussed a few minutes ago. And since we'll be streaming this data in, it's important to know that the stream is terminated by a "data: [DONE]" message, which is part of the server-sent events setup we'll be doing in a few minutes.

So here's the plan: our client side will send an array of messages via the request body to this endpoint; we'll take those messages, send them to the OpenAI API to get a response, and then stream that response back to the client.

The first thing I'll do is set up a try/catch block, and then make sure our OPENAI_KEY does in fact exist — the one we set inside our environment variables. If it doesn't, we throw a new error. Then we grab the request body by assigning requestData to await request.json(). This request data will be an object with a messages property containing an array of messages; that's what we'll send from the client side, which we'll set up shortly. If requestData doesn't exist or is falsy, we throw a new error. Then we set reqMessages, which will be of type ChatCompletionRequestMessage (an array), imported from openai, equal to requestData.messages. If we take a look at this type, we can see it has a role, which is one of system, user, or assistant; content, which is the contents of the message; and name, which is the name of the user in a multi-user chat. We won't be taking advantage of name here, but it's there if you want to explore.

Recall from earlier that the entire chat history is sent with every request. OpenAI still has a 4,096-token limit on a single request, and each message's content counts toward the tokens it has to process, so we don't want our token count to exceed 4,096.
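Here's a minimal sketch of the handler's opening steps as described so far, assuming the env variable is named OPENAI_KEY and using the ChatCompletionRequestMessage type from the v3 openai SDK:

```ts
// src/routes/api/chat/+server.ts
import { OPENAI_KEY } from '$env/static/private';
import { json } from '@sveltejs/kit';
import type { RequestHandler } from './$types';
import type { ChatCompletionRequestMessage } from 'openai';

export const POST: RequestHandler = async ({ request }) => {
  try {
    if (!OPENAI_KEY) {
      throw new Error('OPENAI_KEY env variable not set');
    }

    const requestData = await request.json();
    if (!requestData) {
      throw new Error('No request data');
    }

    const reqMessages: ChatCompletionRequestMessage[] = requestData.messages;
    if (!reqMessages) {
      throw new Error('No messages provided');
    }

    // ...token counting, moderation, and the completion request come next.
    return json({ ok: true }); // placeholder until the streamed response is wired up
  } catch (err) {
    console.error(err);
    return json({ error: 'There was an error processing your request' }, { status: 500 });
  }
};
```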
So how can we prevent the messages from exceeding that limit? We need to take advantage of a tokenizer, and we can install a library called gpt3-tokenizer. There are other libraries out there; I just found that this one works well for me. We'll set up a new file inside our lib directory called tokenizer.ts. Now, the way we have to set this up is a little bit weird; I found this workaround through the GitHub issues, and I'm not sure if it's a SvelteKit problem or a gpt3-tokenizer problem. First we import GPT3TokenizerImport from 'gpt3-tokenizer' — that's the default export. Then we declare a GPT3Tokenizer variable of type typeof GPT3TokenizerImport: if typeof GPT3TokenizerImport is a function, we use it directly; otherwise we use (GPT3TokenizerImport as any).default. Again, I know this looks weird, but it's the way I was able to get it to work; if you know of a better way, please let me know in the comments below.

Next we construct a tokenizer with new GPT3Tokenizer({ type: 'gpt3' }), and we export a function called getTokens. It takes an input of type string and returns a number: the number of tokens. We get the tokens using the tokenizer.encode method, passing in our input, and then we return tokens.text.length, which is the total number of tokens the given string contains.

Back in our endpoint, the first thing we'll do is check that we do in fact have reqMessages; if not, we throw a new error. Then we set up a token count: let tokenCount = 0. For each request message, we get the token count of that message's content and add it to the tokenCount variable, giving us a running total. I won't actually use this until a bit later, but it's good to get it out of the way now.
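Here's the tokenizer module as I understood it from the walkthrough — a sketch of the interop workaround plus the getTokens helper:

```ts
// src/lib/tokenizer.ts
import GPT3TokenizerImport from 'gpt3-tokenizer';

// Workaround: depending on how the package is bundled, the default export may
// be the constructor itself or an object wrapping it, so normalize it here.
const GPT3Tokenizer: typeof GPT3TokenizerImport =
  typeof GPT3TokenizerImport === 'function'
    ? GPT3TokenizerImport
    : (GPT3TokenizerImport as any).default;

const tokenizer = new GPT3Tokenizer({ type: 'gpt3' });

// Returns the number of tokens the given string encodes to.
export function getTokens(input: string): number {
  const tokens = tokenizer.encode(input);
  return tokens.text.length;
}
```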
Next, we're going to hit OpenAI's moderation endpoint. This is basically to prevent us from getting banned from using their APIs if our users pass in some crazy stuff: it's an endpoint that gives us back some results, and if those results are flagged, we can throw an error and not let the message continue on to the actual chat completion endpoint. If we look at the API reference and scroll down to moderations, we can see that we pass in an input and it gives us results; if something's flagged, flagged will be true, otherwise false.

Let's set that up. We'll assign moderationRes to a fetch request to https://api.openai.com/v1/moderations. We pass it some headers — a Content-Type of application/json, and an Authorization header with a Bearer token using the OPENAI_KEY from our private environment variables — the method will be POST, and the body will be JSON.stringify with the input set to reqMessages[reqMessages.length - 1].content. That gives us the very last message, because the older messages should already have been vetted on previous requests (if a user did something crazy there, we're not accounting for it in this video). So we take the most recent message the user sent, grab its content, and pass that to the moderation endpoint. We then get the response with moderationData from moderationRes.json(). Looking at the response shape, there's a results property containing an array of objects, so we destructure the first element of moderationData.results, and if results.flagged, we throw a new error.

If all of that is good, let's define our prompt: "You are a virtual assistant for a company called Huntabyte. Your name is Axel Smith." We add this prompt's tokens to our token count with tokenCount += getTokens(prompt), and then we check whether tokenCount is greater than or equal to 4,000. You could do a number of things here. One thing worth looking into is removing messages from the front of the reqMessages array — starting with index 0, as long as at least two messages remain — and deleting older messages until your token count is under 4,000. Or you can just throw an error, which the client side could handle by resetting the messages. There are a few different ways to do it; for this simple example I'm just going to throw an error, but definitely explore ways to handle this more smoothly.

Next we construct the messages array that we'll pass to the chat completion endpoint. It's of type ChatCompletionRequestMessage[], and it starts with the system message — a role of system with the content set to the prompt we created above — followed by a spread of the rest of the request messages. Then we set up our chat request options in a new variable called chatRequestOpts, of type CreateChatCompletionRequest. It's an object with a model of gpt-3.5-turbo, the messages we just created, a temperature of 0.9 (keep it frisky), and stream set to true. TypeScript complains that we can't use the namespace as a type here, so we need to import CreateChatCompletionRequest as a type; once we set that up, we're good to go.
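Put together, the token budgeting, moderation check, and request options inside the handler might look like this sketch (assuming getTokens comes from the tokenizer module above):

```ts
import { getTokens } from '$lib/tokenizer';
import type { ChatCompletionRequestMessage, CreateChatCompletionRequest } from 'openai';

// ...inside the POST handler's try block, after reqMessages has been validated:
let tokenCount = 0;
reqMessages.forEach((msg) => {
  tokenCount += getTokens(msg.content);
});

// Moderate only the newest message; earlier ones were vetted on prior requests.
const moderationRes = await fetch('https://api.openai.com/v1/moderations', {
  headers: {
    'Content-Type': 'application/json',
    Authorization: `Bearer ${OPENAI_KEY}`
  },
  method: 'POST',
  body: JSON.stringify({ input: reqMessages[reqMessages.length - 1].content })
});
const moderationData = await moderationRes.json();
const [results] = moderationData.results;
if (results.flagged) {
  throw new Error('Query flagged by OpenAI moderation');
}

const prompt = 'You are a virtual assistant for a company called Huntabyte. Your name is Axel Smith.';
tokenCount += getTokens(prompt);
if (tokenCount >= 4000) {
  // Simplest option for this example; trimming old messages works too.
  throw new Error('Query exceeds the 4000 token limit');
}

const messages: ChatCompletionRequestMessage[] = [
  { role: 'system', content: prompt },
  ...reqMessages
];

const chatRequestOpts: CreateChatCompletionRequest = {
  model: 'gpt-3.5-turbo',
  messages,
  temperature: 0.9,
  stream: true
};
```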
Now we can actually issue the request. We'll assign chatResponse to a fetch request to the chat completions endpoint at https://api.openai.com/v1/chat/completions, passing the typical headers — the Authorization header with our Bearer token, and a Content-Type of application/json — with a method of POST and a body of JSON.stringify(chatRequestOpts), which we just defined.

Next we check that the response was OK; if it wasn't, we get the error and throw a new error with that message. If everything is good, we return a new Response containing the proxied stream: chatResponse.body. Remember, stream: true tells the OpenAI API that we want a streamed response back rather than a regular JSON response, and here we proxy that streamed response back to our client through our own endpoint. For the headers, we set the Content-Type to text/event-stream. Then we get rid of the placeholder response, and in the catch block we console any errors and return json(), which comes from @sveltejs/kit.

With that, our endpoint should be functioning as we'd expect (after updating the import statement so it doesn't look so sloppy). To recap: we get the request data and check that it exists; we get the messages from the request body and throw an error if there are none; we set up the token count, go through each request message, and add each message's tokens to the total; we run our moderation request to make sure the latest message isn't saying anything crazy, throwing an error if the results are flagged; we construct our prompt and add its tokens to the total before checking that we're not over 4,000 (I believe the actual limit is 4,096 for this new API, but I'm leaving it at 4,000); we construct a new messages array whose first message is the system message containing our prompt, followed by the rest of the request messages passed from the client; we define our options with the model, messages, temperature, and stream (this one is important); and finally we make our request to OpenAI and, if everything is good, return the completion back to the client with a content type of text/event-stream.
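As a consolidated sketch, the completion request and the proxied streamed response might look like this, continuing inside the same try/catch with chatRequestOpts as defined above:

```ts
// Send the completion request to OpenAI with streaming enabled.
const chatResponse = await fetch('https://api.openai.com/v1/chat/completions', {
  headers: {
    Authorization: `Bearer ${OPENAI_KEY}`,
    'Content-Type': 'application/json'
  },
  method: 'POST',
  body: JSON.stringify(chatRequestOpts)
});

if (!chatResponse.ok) {
  const err = await chatResponse.json();
  throw new Error(err.error?.message ?? 'Failed to create chat completion');
}

// Proxy OpenAI's server-sent event stream straight through to our client.
return new Response(chatResponse.body, {
  headers: { 'Content-Type': 'text/event-stream' }
});
```

The catch block then logs the error and returns a JSON error response via json() from @sveltejs/kit, as shown in the handler skeleton earlier.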
Now let's set up our client side. The first thing we need to do is install a package called sse.js. If you're using TypeScript, it doesn't ship with type definitions, so we'll create one ourselves: make a new file inside lib called sse.d.ts, and I'll just paste the declarations in — you can take a look, but it's basically all the types for this library. sse.js unlocks a couple of capabilities that don't come out of the box with EventSource; it extends the native EventSource with a few more options we'll want for server-sent events, like custom headers and a request payload.

In our +page.svelte — our homepage — the first thing we'll do is set up a few variables: query, a string defaulting to an empty string; answer, also a string; loading, a boolean defaulting to false; and a chatMessages array where we'll store all of our chat messages. Remember, OpenAI's API does not keep track of our messages for us and doesn't return them, so we need some way to track them in local state — and since our server side is going to be serverless or edge functions, we have to keep that state here. chatMessages will be of type ChatCompletionRequestMessage[] and starts as an empty array.

Then we define an async function called handleSubmit — it won't do anything yet — and wire it up to the on:submit of our form (with preventDefault so the page doesn't reload), and we bind the value of the input to query. Now, whenever we submit the form, handleSubmit is called.

We're going to keep all the chat messages, both from our client and from the assistant, inside the chatMessages array. So when the form is submitted, we first set loading to true, then set chatMessages equal to whatever is currently in chatMessages — spread out first — plus our own latest message at the end: a role of user, with the content set to query, whatever was just submitted in the form. Then we set up a new server-sent events connection: we define an eventSource as a new SSE (imported from sse.js) pointed at /api/chat, with a Content-Type header of application/json and a payload containing those messages — remember, on our server side we're expecting a messages property on the request data, which is why we shape it this way. Once we've pointed the variable at that new SSE, we clear out query by setting it to an empty string so the next question can be typed.

Then we add a couple of event listeners to this event source. The first is for error, and we can set up a function to handle errors so that any errors occurring in our application can be handled in one place. I'll define a handleError function — it's generic, taking in whatever error type we get — that sets loading to false, sets query to an empty string, and sets answer to an empty string, pretty much clearing everything out, and then just consoles the error for now (feel free to do something else, such as throwing up a toast notification). When we add the error event listener, we just pass the handleError function; the error event will be passed into it, retain its shape, and work its magic.
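Here's a sketch of that client-side state and the submit handler, assuming the sse.js constructor accepts the headers and payload options described above (the message listener is filled in next):

```ts
// +page.svelte — inside <script lang="ts">
import { SSE } from 'sse.js';
import type { ChatCompletionRequestMessage } from 'openai';

let query = '';
let answer = '';
let loading = false;
let chatMessages: ChatCompletionRequestMessage[] = [];

async function handleSubmit() {
  loading = true;
  // Append the newest user message to the local chat history.
  chatMessages = [...chatMessages, { role: 'user', content: query }];

  const eventSource = new SSE('/api/chat', {
    headers: { 'Content-Type': 'application/json' },
    // The endpoint expects { messages } on the request body.
    payload: JSON.stringify({ messages: chatMessages })
  });
  query = '';

  eventSource.addEventListener('error', handleError);
  // The 'message' listener is added here too — see the next snippet.
  eventSource.stream();
}

function handleError<T>(err: T) {
  loading = false;
  query = '';
  answer = '';
  console.error(err);
}
```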
The next event listener we'll add is for message events. The tokens, as they're generated, get passed here, and that's how we'll be able to render the response across the screen as it comes in. With each chunk we get a new event, so I'll set up a try/catch inside the listener. We set loading to false, because we're no longer loading — we have at least part of the message. Then we first check whether e.data is equal to "[DONE]", inside brackets like so. If we look at the documentation, we can see that partial message deltas are sent, like in ChatGPT, whenever stream is set to true, and that the stream is terminated by a data: [DONE] message; that's how we know we're done receiving tokens from the stream. If the data is [DONE], we set chatMessages equal to whatever is currently in chatMessages plus the latest message we got back from the assistant: a role of assistant, with the content set to answer, which we'll have been populating with the streamed tokens as they come in. Then we set answer back to an empty string, because we're now done, and return.

If it's not [DONE], we parse a completionResponse with JSON.parse(e.data). The parsed object has a choices property, which is an array; we want the delta from that array, and the content on that delta is where our tokens live. So we destructure delta from index 0 of completionResponse.choices, and if delta.content exists — if it's not undefined — we assign answer to (answer if it currently exists, otherwise an empty string) plus delta.content. We're basically appending those tokens onto answer as they come in; that's how the chat bubble fills in. If it's the first chunk, it's an empty string plus the content; the next chunk that comes through gets appended to the end of that, just like ChatGPT does. If we catch any errors, we call handleError and pass in the error; we're already consoling it there, so we'll get a console message on the client side.

Then, outside of this event listener but still inside handleSubmit, we call eventSource.stream(), which basically tells it to start sending messages over server-sent events.

Now we should start getting messages. If we start up our dev server and head to localhost:5173, we'll see the components I currently have placed here — they aren't actually doing anything yet, they're just there for demonstration purposes. What we want to test is the ability to receive a streamed response. I type a message, say "hello", and we get an error: checking back at /api/chat, I actually put http instead of https for the endpoint, so let me fix that and check the rest of them really quick. Now if I come back and type "hello", the response gets streamed in, so we are in fact getting that data back from the server.
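The message listener described above might look like this sketch, placed alongside the error listener inside handleSubmit:

```ts
eventSource.addEventListener('message', (e) => {
  try {
    loading = false;
    if (e.data === '[DONE]') {
      // Stream finished: commit the accumulated answer to the chat history.
      chatMessages = [...chatMessages, { role: 'assistant', content: answer }];
      answer = '';
      return;
    }
    // Each event carries a partial "delta" with the next token(s).
    const completionResponse = JSON.parse(e.data);
    const [{ delta }] = completionResponse.choices;
    if (delta.content) {
      answer = (answer ?? '') + delta.content;
    }
  } catch (err) {
    handleError(err);
  }
});
```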
Let's now render it on the page and get our chat functional. We'll come down into the markup, get rid of all these placeholder chat messages, and set up an each block: for each of chatMessages as message, we render a ChatMessage whose type is message.role — we already have access to the role, which is why I defined the component the way I did — and whose message is message.content. One thing to remember is that we don't add the streamed-in answer to chatMessages until it has finished streaming, so we'll also add a check: if answer exists, we render another ChatMessage with a type of assistant and a message equal to answer. We can do something similar for loading: if loading, we show a ChatMessage of type assistant with a loading message.

Now if we come back into the application and say "hello", we see "Hello there! I'm Axel Smith. How can I help you today?" I say I want to learn to code, then type "python", and you can see the message being streamed in — but it's rendering down below, and we have to keep scrolling to see it. So let's set up a little helper function that automatically scrolls to the bottom of the container whenever we send a new message, as well as when new messages are streamed in.

Back at the top of the app, I'll define a new function called scrollToBottom. The reason we set it up this way is to add a little bit of a delay, because sometimes the HTML isn't finished rendering and the scroll doesn't land in the right spot; this is the way I found to make it happen every single time. We set up a setTimeout that takes a function, and we need a div to scroll to. I already have one set up at the bottom of the container where all the messages are rendered, so we create a variable called scrollToDiv of type HTMLDivElement and bind it with bind:this={scrollToDiv}. Back in scrollToBottom, we call scrollToDiv.scrollIntoView with behavior set to smooth, block set to end, and inline set to nearest, and we set the timeout to 100 milliseconds. Then I add a scrollToBottom call to the message event listener, and also at the bottom of handleSubmit, underneath stream(), to make sure it happens when we first submit the request — so our newly submitted message is visible — and again as the response comes back from OpenAI, so we can see the new message as it streams in. Now if we save, come back into the application, and send a couple of messages, we get scrolled down on submit, and as the response streams in, the container scrolls to the bottom of the message.
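Here's a sketch of the rendering loop plus the auto-scroll helper, assuming a ChatMessage component that takes type and message props as described; the container classes are illustrative:

```svelte
<script lang="ts">
  let scrollToDiv: HTMLDivElement;

  function scrollToBottom() {
    // Small delay so the DOM finishes rendering before we scroll.
    setTimeout(() => {
      scrollToDiv.scrollIntoView({ behavior: 'smooth', block: 'end', inline: 'nearest' });
    }, 100);
  }
</script>

<div class="flex flex-col gap-2">
  {#each chatMessages as message}
    <ChatMessage type={message.role} message={message.content} />
  {/each}
  {#if answer}
    <ChatMessage type="assistant" message={answer} />
  {/if}
  {#if loading}
    <ChatMessage type="assistant" message="Loading.." />
  {/if}
  <!-- Sentinel element the helper scrolls into view. -->
  <div bind:this={scrollToDiv} />
</div>
```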
One last thing we can do to make this look a little better: take one of those chat message components and place it at the top, set it to the assistant, and give it the message "Hello, ask me anything you want." That way, when someone visits the website, they have a prompt already set up so they know what to do. It doesn't actually get sent off with the rest of the requests; it's just there as a visual aid.

All right, now let's deploy this application to Vercel, taking advantage of both the edge functions and the serverless runtime. The first thing we need to do is install the SvelteKit adapter for Vercel. Then, in the svelte.config.js file, we change adapter-auto to adapter-vercel, and within the adapter we pass an object setting the default runtime to nodejs18.x — so by default everything runs on the serverless Node runtime. We can then get more specific with each function: for example, we can set our POST request handler to run on the edge. To do that, we import the Config type, which comes from the Vercel adapter, and export a config of type Config with the runtime set to edge.

Then we commit all of this code to GitHub and head over to the Vercel dashboard, where we can deploy a new project and select that git repository. When deploying, we need to set our environment variables, so let me grab those from the .env file using Vercel's incredible copy-paste support. Then we click Deploy, and after a few seconds we get the congratulations screen — we've just deployed a new project to Vercel. If we go check it out, it's working as expected.

That's going to wrap up today's video. If you got value out of it, don't forget to drop a like and subscribe, let me know what type of content you all want to see next in the comments down below, thank you so much for watching, and I'll see you in the next one.
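For reference, a minimal sketch of the adapter configuration described in the deployment section — the default serverless runtime in svelte.config.js, plus the per-route edge override in the chat endpoint:

```js
// svelte.config.js
import adapter from '@sveltejs/adapter-vercel';

const config = {
  kit: {
    // Default every function to the serverless Node 18 runtime.
    adapter: adapter({ runtime: 'nodejs18.x' })
  }
};

export default config;
```

```ts
// src/routes/api/chat/+server.ts — opt this one route into the edge runtime.
import type { Config } from '@sveltejs/adapter-vercel';

export const config: Config = { runtime: 'edge' };
```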
Info
Channel: Huntabyte
Views: 124,401
Keywords: sveltekit, sveltekit tutorial, sveltekit crash course, sveltekit form actions, sveltekit forms, sveltekit actions, form actions sveltekit, api route svelte, svelte routes, layouts, advanced layout sveltekit, sveltekit icons, sveltekit extensions, sveltekit productivity, svelte search, svelte react, svelte searchbar, svelte filtering, svelte stores, how to add search to svelte, search data svelte, svelte data search, redis sveltekit, sveltekit cache
Id: dXsZp39L2Jk
Length: 22min 45sec (1365 seconds)
Published: Sat Mar 04 2023