Build a Next.js Answer Engine with the Vercel AI SDK, Groq, Mistral, LangChain, OpenAI, Brave & Serper

Captions
In this video, I'm going to show you how to build your own Perplexity-style LLM answer engine completely from scratch. We're going to build the front end and the back end, and I'll show you how to use all of the different technologies involved: the inference APIs, RAG, LangChain, a number of search engine APIs, and the OpenAI embeddings API. There's a ton packed into this, so it's going to be a bit longer than my typical videos and definitely more of a technical deep dive, but if you're interested in seeing how I built this, stay tuned. I'll go through exactly how to set it up and walk you through all of the steps and the logic behind why I set it up the way I did.

To get started, I'll put a link in the description of the video where you can clone the repo. There are a couple of prerequisites: you'll obviously need Node.js installed, and you'll need to fetch a number of API keys. You can get started for free with the Brave Search API. The reason I use two different search APIs is that Brave only gives you one query per second, and in the example I showed we have the search results themselves plus images and videos, so I use a second search engine API, Serper. The nice thing with Serper is that you can try it out for free, without even entering a credit card, and get 2,500 free queries. Then we'll hunt down an API key for OpenAI; we're using OpenAI just for embeddings, and you could swap in a different embeddings model if you'd like. Finally, get an API key from Groq: head over to console.groq.com, click API Keys, and generate one. Once you have all of that and the repo cloned, head to the root of the project and run npm install to install the packages.

Once you're ready with your API keys, they go into a .env file; there's a .env.example in the repo, and you just fill them in one by one.
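As a rough sketch, a filled-in .env might look like the following; the exact variable names are whatever .env.example in the repo uses, so treat the names below as assumptions:

```
OPENAI_API_KEY=sk-...
GROQ_API_KEY=gsk_...
BRAVE_SEARCH_API_KEY=...
SERPER_API_KEY=...
```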
As you can see, we have keys for OpenAI, Groq, Brave, and Serper. Once all of those are plugged in, we're ready to dive into the application. The way I'm going to structure this is: backend logic first, then the front-end logic, and then each component that gets rendered on the screen, one by one. The other thing I don't think I mentioned is that you'll be able to deploy this to Vercel by the end. If I demonstrate here with "latest Sam Altman podcast with Lex", you can see how it works on the Vercel platform: you'll be able to put this on Vercel and interact with it there, and if you'd like, you can set up some authentication on top. I'm not going to do that in this video, but it's really easy: if you have the Vercel CLI set up, you can deploy it through your Vercel account for free, or alternatively just run it locally.

One question I anticipate is: can I use Ollama with this? You absolutely can. You can query Groq, or you can query GPT-4 or a different model like Opus, but if you want to use Ollama it's really simple: you swap out the endpoint in the example, swap the API key for Ollama's, and then change the model ID string in the chat completion calls and a couple of other spots, and you'll have it running locally. I tested that a little earlier, so if you're an Ollama fan, or just a fan of running things locally, know that you can get the inference portion of this working with a local model.
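For reference, here's a minimal sketch of that swap using the openai npm client, assuming Ollama's OpenAI-compatible endpoint on its default port; the model you point at is whatever you've pulled locally:

```typescript
import OpenAI from 'openai';

// Default setup: the OpenAI client pointed at Groq's OpenAI-compatible endpoint.
const openai = new OpenAI({
  baseURL: 'https://api.groq.com/openai/v1',
  apiKey: process.env.GROQ_API_KEY,
});

// Local swap: point the same client at Ollama instead (assumes a model has
// already been pulled, e.g. `ollama pull mistral`).
const ollama = new OpenAI({
  baseURL: 'http://localhost:11434/v1',
  apiKey: 'ollama', // Ollama ignores the key, but the client requires one
});
```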
To get started with our backend, we're going to do pretty much everything within our actions file; if you want to break it out so it's a little more modular, you can do that. In action.tsx, we first import a number of dependencies, which I'll go into as we work through the code. The first thing we do is initialize the OpenAI client with the Groq API. The nice thing about the Groq API is that it conforms to the OpenAI schema, so we can query it just like we would the OpenAI API. And the nice thing about that standard is that, as I mentioned, you can swap it out for something like Ollama: point the base URL at your localhost, change the API key (I believe it's "ollama"), and update the model in the chat completions and a couple of other spots to get everything working locally.

Next we set up some of the interfaces for the different types we'll be using. For our first function, we declare getSources. I'll zoom in a little and close my sidebar so you can see this better. getSources takes the message as an argument and passes it to the search engine API. One thing you could do, especially if you're sending in particularly long queries, is pass the user's initial query through something that optimizes it for a search engine: you could easily use an LLM as a rephraser and ask it for a message suited to a search engine API. I cover that in a similar video, which I'll link in the description, and it would work as an interim step here. It would increase latency, but it also prevents edge cases where a really long query errors out on the endpoint. The way this function works is that we encode the message, specify that we want 10 results, and then parse the web response for the handful of fields we'll be using.

Now that we've performed the search query and have those results from Brave, we go through those proverbial 10 blue links and scrape each page. One thing to note is that I use a timeout of 800 milliseconds: if any page is taking a long time to respond, I don't want it to hang; I want it to exit, skip that result, and return what it has. Also, since we're just making a crude fetch request, you may notice it doesn't fetch every result; you could use something like Puppeteer here instead, but that's definitely going to increase your compute cost, and potentially how quickly the query runs. If you want more results, though, Puppeteer is certainly an option you could explore as an interim step within this function.
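Here's a minimal sketch of that pattern, assuming Brave's documented response shape (web.results entries with title, url, and description); the function and field names are mine, not necessarily the repo's:

```typescript
interface SearchResult {
  title: string;
  link: string;
  snippet: string;
}

// Query the Brave Search API for the top N results for a message.
async function getSources(message: string, numberOfResults = 10): Promise<SearchResult[]> {
  const response = await fetch(
    `https://api.search.brave.com/res/v1/web/search?q=${encodeURIComponent(message)}&count=${numberOfResults}`,
    {
      headers: {
        Accept: 'application/json',
        'X-Subscription-Token': process.env.BRAVE_SEARCH_API_KEY as string,
      },
    },
  );
  const data = await response.json();
  return data.web.results.map((result: any) => ({
    title: result.title,
    link: result.url,
    snippet: result.description,
  }));
}

// Fetch a page with a hard timeout so slow pages are skipped rather than
// blocking the whole request (the walkthrough uses 800 ms).
async function fetchWithTimeout(url: string, timeoutMs = 800): Promise<Response> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await fetch(url, { signal: controller.signal });
  } finally {
    clearTimeout(timer);
  }
}
```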
From there, we do a really crude extraction of the main content of each page: we strip out a bunch of things we don't need, such as inline styles, scripts, and (most likely) whatever is in the navigation and footer, and keep as much of the raw text as possible. Then we create a promise that concurrently requests all of those pages, extracts the page contents, and finally returns them to us in the array we've specified.

Once we have the contents, we process and vectorize them using LangChain. In this function, we pass in the query and specify the chunk size. This is something you can play around with; I might even add a UI component on the front end for exploring different chunk sizes, or potentially different RAG methods. For now, this is just a basic RAG implementation: a chunk size of 1,000 characters, a chunk overlap of 400 characters, and four similarity results returned. Those are the default options, and you can call the function with different arguments if you'd like to swap any of them out. What we do here is loop through all of the web pages we just fetched, split the text with the RecursiveCharacterTextSplitter from LangChain, pass the split text to the OpenAI embeddings API, set up some metadata, and finally perform a similarity search on each web page against the query, returning the number of results per page that you've specified.
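A minimal sketch of that per-page step with LangChain, using an in-memory vector store and the OpenAI embeddings; the function name and metadata shape are illustrative:

```typescript
import { RecursiveCharacterTextSplitter } from 'langchain/text_splitter';
import { MemoryVectorStore } from 'langchain/vectorstores/memory';
import { OpenAIEmbeddings } from '@langchain/openai';

// Split one scraped page into chunks, embed them, and pull back the chunks
// most similar to the user's query.
async function vectorizePageContent(
  pageText: string,
  pageUrl: string,
  query: string,
  chunkSize = 1000,
  chunkOverlap = 400,
  numberOfSimilarityResults = 4,
) {
  const splitter = new RecursiveCharacterTextSplitter({ chunkSize, chunkOverlap });
  // Attach the source URL as metadata so results can be traced back later.
  const docs = await splitter.createDocuments([pageText], [{ source: pageUrl }]);
  const vectorStore = await MemoryVectorStore.fromDocuments(docs, new OpenAIEmbeddings());
  return vectorStore.similaritySearch(query, numberOfSimilarityResults);
}
```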
In the next function, we get the images, very similar to the above: we query the endpoint (you can use Brave, you can use Serper, you can use whatever you want here) and parse the responses. In this function we also actually fetch each image to verify that we get a result back. The reason I wanted to do this is that when I was setting it up, I noticed a number of the images wouldn't load, so I set up a little helper function to make sure the request succeeds before we try to render the image to the DOM and show a broken link. From there, we just send back the title and the link; you could use the title for something like the alt tag. I slice the results to nine because that's what I have set up in the UI, but you could pick a different number. I also show how you'd query Serper: if you want to use the Serper images API, you can just drop that query in above, so if you want Serper for both images and videos you can easily swap it in for the above, and vice versa if you'd rather use Brave for the videos option. Very similarly for the videos, we check that each result loads successfully before we try to render it to the DOM, and we send those nine results back.

Once that's set up, we use a feature added to the Groq endpoint just last week: JSON mode. If you specify a response format of type json_object, you can pass in a system message like this along with the message you'd like to send. Here, the message is "here are the top results from a similarity search"; we JSON.stringify all of those results, and based on them we get back the follow-up questions that you'll see at the end of the application.
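A sketch of that call, reusing the Groq-pointed openai client from earlier; the model ID, prompt wording, and JSON shape are illustrative rather than copied from the repo:

```typescript
// Ask Groq (in JSON mode) for follow-up questions based on the vector results.
async function getFollowUpQuestions(vectorResults: unknown) {
  const completion = await openai.chat.completions.create({
    model: 'mixtral-8x7b-32768', // illustrative Groq model ID
    response_format: { type: 'json_object' },
    messages: [
      {
        role: 'system',
        content:
          'You are a helpful assistant. Respond with JSON only, in the shape ' +
          '{ "followUp": ["question 1", "question 2", "question 3"] }.',
      },
      {
        role: 'user',
        content: `Here are the top results from a similarity search: ${JSON.stringify(
          vectorResults,
        )}. Based on these, generate follow-up questions.`,
      },
    ],
  });
  return JSON.parse(completion.choices[0].message.content ?? '{}');
}
```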
Next, we declare the main action function. This is what orchestrates all of the functions above and actually streams the results to the front end. Inside it, we set up a createStreamableValue; what createStreamableValue allows us to do is send that JSON to the front end as it comes through, and you'll see that in just a moment. Once that's declared, we set up an asynchronous function and log the user message when it comes through. We set up a promise for getting the images, sources, and videos, and once it resolves, we send all of those to the front end together. That gives you the effect where they all render in at the same time: instead of waiting for each result to load one by one, we make all of those requests concurrently, and they're sent back to the front end concurrently once they've resolved.

Now that we have our sources, which are all of the results we got from Brave, we pass them into our get-10-blue-links function, and then pass those results into our process-and-vectorize-content function, which returns the vector results. Once we have those, we declare our streaming chat completion. (I see a little formatting in the system message that I'll clean up before I push the repo.) The system message is: here is my query, with the user message passed in, and respond with an answer that is as long as possible. I wanted that just to demonstrate the speed of the Groq LPU inference chips; they're super fast, and it's neat to see how many tokens they can generate as quickly as they do. Then I say: if you can't find any results, respond with "no relevant results found". The user message is "here are the top results from a similarity search", followed by the JSON.stringify of all of the vector results with their metadata; we send that in and await the response.

The cool thing with the Vercel AI SDK, and leveraging the createStreamableValue API, is that we can iterate through the stream chunks as they come in and send them straight to the front end. It took me a little while to figure out how I wanted to manage state, but once I had that figured out, it's pretty powerful what you can do with this. Kudos to the team at Vercel; they did a great job with the new AI SDK, and I've had a lot of fun playing around with it.

From there, we generate the relevant follow-up questions: this is where we use that JSON mode within Groq and ask it to respond in the particular schema we specified, passing in the results from earlier and asking it to generate follow-up questions, which we then stream out as they come in. You can stream other things too; you can even stream UI, full components, from the back end. There's a ton of different stuff here, and I'll likely cover it in future videos on this new Vercel AI SDK, but this is something I'm going to clean up in the repo before I push it live. The other thing to note is that you must call done once everything is finished: if you only ever call update and never call done, you will get errors in your console. From there, we return the streamable value, and finally we define our initial AI state and UI state just as you see here.
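Pulling those pieces together, here's a compressed sketch of what that orchestration can look like. The helper names mirror the walkthrough (and the sketches above), but their exact signatures, the model ID, and the payload keys are assumptions:

```typescript
'use server';

import { createStreamableValue } from 'ai/rsc';

export async function myAction(userMessage: string) {
  const streamable = createStreamableValue({});

  (async () => {
    // Fire the three searches concurrently and stream them out together.
    const [images, sources, videos] = await Promise.all([
      getImages(userMessage),
      getSources(userMessage),
      getVideos(userMessage),
    ]);
    streamable.update({ searchResults: sources, images, videos });

    // Scrape the 10 blue links, then run the RAG step over the page contents.
    const pages = await get10BlueLinks(sources);
    const vectorResults = await processAndVectorizeContent(pages, userMessage);

    // Stream the answer token by token as Groq produces it.
    const chatCompletion = await openai.chat.completions.create({
      model: 'mixtral-8x7b-32768',
      stream: true,
      messages: [
        {
          role: 'system',
          content:
            'Answer the query using the provided results, as thoroughly as possible. ' +
            'If there are no relevant results, respond with "No relevant results found."',
        },
        {
          role: 'user',
          content: `Here are the top results from a similarity search: ${JSON.stringify(
            vectorResults,
          )}. Query: ${userMessage}`,
        },
      ],
    });
    for await (const chunk of chatCompletion) {
      streamable.update({ llmResponse: chunk.choices[0]?.delta?.content ?? '' });
    }

    // Always close the stream, or the client will log errors.
    streamable.done({ llmResponseEnd: true });
  })();

  return streamable.value;
}
```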
That's pretty much it for our back end. I'll just make a quick git add and git commit ("update actions") and push that to the repo.

For the front end of our application, we import a number of dependencies, but the main part I want to call out is our custom components. If you want to play around with the components I've set up, adding different effects or functionality, you can go into each of them and experiment pretty safely without affecting other portions of the application. Similar to the back end, we set up a number of interfaces for all of the types we'll be using. Then we set everything up within our page. First is myAction, which again comes from the Vercel AI SDK; this is how the page interacts with the action.tsx on the server. Then we set up a number of things for form submission handling (onKeyDown, Enter, and so on), plus the state for the input itself; all of that good stuff is here.

From there, we have a simple state for all of our messages. The state for most of the application lives in this messages array: each result is an array item containing an object with the follow-up, the images, the videos, and the sources, and once the actual LLM response is complete we put that into the messages state as well. While we're streaming responses from the LLM to the front end, though, we keep an interim state, the current LLM response, which is what you see here. Then we set up a handler for when the user clicks a follow-up question: originally this was only wired up to the initial form input, but I wanted it to work when you click the follow-up questions too, which is why it's broken out a little here, so it works both when you're typing in the input and when you're clicking those follow-up questions.

Next, there's a simple useEffect that's called whenever the input ref changes; on key down, it watches for changes on the input so we can set the state for its value, and we have a few functions for handling form submission as well as clicks on the follow-up questions. Then we set up handleUserMessageSubmission. Here we create a unique key, which we pass in the props of each of our components as we iterate through them so each has something unique, and we set the date once so it doesn't cause multiple re-renders; that key goes on each component. This is what the message object looks like: the ID, the type, the user message content, images, videos, follow-up, whether it's streaming or not, and the search results. Once we have that, we set the message, and then we await all of the asynchronous results from our action. This is where we call the action: as the responses come back from the back end, we read them as a stream, in a loop that waits for the different results. This is what orchestrates where the different values go on the front end: as the readable stream comes in from our React server component, it delegates which piece of state each thing belongs to. If it's images, it goes on the images key; if it's videos, on the videos key, and so on. Then we set the current LLM response.
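A sketch of that client-side loop, reading the streamable value from the action above; the payload keys match that sketch, and the provider import and setter names are assumptions:

```typescript
'use client';

import { useState } from 'react';
import { useActions, readStreamableValue } from 'ai/rsc';
import type { AI } from './action'; // assumption: the AI provider exported by the actions file

export default function Page() {
  const { myAction } = useActions<typeof AI>();
  const [currentLlmResponse, setCurrentLlmResponse] = useState('');

  async function handleSubmit(userMessage: string) {
    const streamableValue = await myAction(userMessage);
    // Each chunk is a partial payload; route it to the right piece of state.
    for await (const chunk of readStreamableValue(streamableValue)) {
      if (!chunk) continue;
      if (chunk.llmResponse) {
        setCurrentLlmResponse((prev) => prev + chunk.llmResponse);
      }
      // ...and similarly for searchResults, images, videos, and followUp
    }
  }

  // The real page renders the messages loop, scroll anchor, and input form here.
  return null;
}
```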
From there, we just render out our view. Pretty much everything lives within the messages loop, which I'll go into in a moment, and the bottom half is just everything for our simple form submission. We're basically returning three pieces: the messages array; a chat scroll anchor, which is how the page scrolls to the bottom as messages come through; and the input itself, the text area where you type the message, with the button to submit. The main portion is the messages array: as results come through, if they exist, we render them in these sequential chunks. That's something I realized while setting this up: we really had to set up state in a way where it made sense to group all of these things together, while staying mindful that certain values stream in and can potentially be updated multiple times. This is where each of our components gets its key. You have the search results up top; then the user message, which is what you actually submitted; then the LLM response component, which shows either the current LLM response streaming in or, once that's done, the LLM response set as historic state; the follow-up questions, which appear below once they come through; and finally, in the sidebar, the videos and images. You'll notice that each of these has a key. They're important so it doesn't cause multiple re-renders: the keys are how React knows not to excessively re-render all of these components, since they're inside a loop. You can also look into memoizing components if you'd like.

Those are the main pieces of the application. All of the custom components live in the components folder, under the answers directory, so you can go check them out; there's essentially a different component for each of the core elements you see on screen. We have the search results component; the user message component, which is the simplest because it just shows what you submitted along with a couple of logos; and the images component. I set these up with varying complexity, so if you're new to React or Next.js, take a look at the user message component and try playing around with it, and if you're looking for something more adventurous, check out the videos component: that's how we actually get the videos and show the iframe in the corner, with the embedded videos floating absolutely within the application.

I encourage you to play around with this. If you have any ideas for the application, put them in a comment on the video, put them on GitHub, or send me a message on Twitter; whatever you like to do. That's pretty much it for this video. I hope you found it useful; I had an absolute blast building this application. I'm going to release it under a very permissive license, so you'll be able to take it, build a business with it, do whatever you want with it, and hopefully learn something. If you found this video useful, please like, comment, share, and subscribe; consider becoming a Patreon subscriber or a subscriber on YouTube, or support me in whatever way you'd like. Even a GitHub star would be great. Let me know what you'd like to see in upcoming content in a comment below. You know where to find me. Until the next one.
Info
Channel: Developers Digest
Views: 8,887
Id: kFC-OWw7G8k
Length: 21min 25sec (1285 seconds)
Published: Sun Mar 24 2024