Build an AI RAG Application with LangChain & Next.js

Video Statistics and Information

Captions
RAG stands for retrieval-augmented generation. Let's build a RAG application with LangChain and Next.js.

Hello and welcome, I'm Dave. Today we'll build an AI RAG application with LangChain and Next.js. I'll provide links to all resources in the description below, along with a link to join my Discord server, where you can discuss web development with other students, ask questions that I can answer, and receive help from other viewers too. I look forward to seeing you there.

Hey guys, we recently built a simple AI chatbot application with Next.js, and that's the type of project I believe you should have, at a minimum, in your portfolio. This week we're going to make it so much better: we're going to take our simple chatbot and turn it into a RAG application using LangChain and Next.js.

Let's quickly look at what RAG (retrieval-augmented generation) actually means to us. We start with a large language model; we powered our chatbot with OpenAI. But LLMs are generalists trained on publicly available data, and that gives us three problem areas. The first is a lack of domain knowledge: LLMs don't know anything about your specific data. The second is hallucinations: LLMs will sometimes simply make up answers. The third is data cutoffs: LLMs have training data cutoffs and don't know anything beyond a certain date. We can improve all of those issues with RAG. We can provide domain-specific data; by supplying our specific information we also reduce hallucinations, and we can keep our data updated as often as needed.

So how does RAG work? In our simple chat app, we asked a question, which was passed to the LLM, which then provided us with a response. With RAG we add a step: we start with a question prompt, then there is a data retrieval step, and then the prompt with the data context is passed to the LLM, and a better response is provided.
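As a rough illustration of that extra step (this is my own sketch, not code from the tutorial repo; the `Retriever` type and `buildRagPrompt` helper are hypothetical), retrieval just means stuffing relevant text into the prompt before it reaches the model:

```typescript
// Minimal sketch of the RAG flow: retrieve domain data, then augment the prompt.
// Hypothetical helper names; real apps use vector search instead of keyword filtering.
type Retriever = (question: string) => string[];

function buildRagPrompt(question: string, retrieve: Retriever): string {
  // 1. Retrieval: look up domain-specific context for this question.
  const context = retrieve(question).join("\n");
  // 2. Augmentation: the LLM now answers from OUR data, not just its training set.
  return `Answer based only on the following context:\n${context}\n\nQuestion: ${question}`;
}

// Toy retriever over an in-memory "document store".
const docs = [
  "Kansas was admitted to the US in 1861.",
  "The capital of Kansas is Topeka.",
];
const retrieve: Retriever = (q) =>
  docs.filter((d) => q.toLowerCase().includes("kansas") && d.toLowerCase().includes("kansas"));

const prompt = buildRagPrompt("What is the capital of Kansas?", retrieve);
// `prompt` now contains both Kansas facts plus the original question.
```

The augmented prompt is what gets sent to the LLM, which is why the model can answer from data it was never trained on.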
So let's get started setting up the LangChain framework. I'm on the introduction page of the LangChain docs, and it says LangChain is a framework for developing applications powered by large language models, and that is what we're going to build today. It also says LangChain simplifies every stage of the LLM application lifecycle, and like any framework it should abstract and simplify some things for us, although LangChain is still complicated to work with. Note while we're on the docs that LangChain has both Python and JavaScript documentation; this introduction page is actually in the Python docs, and you can use the drop-down menu at the top (where you see the parrot and the chain) to choose the JS/TS docs, so I'll do that. The JS/TS page is a little different: it says LangChain is a framework for developing applications powered by language models that are context-aware, so you can connect a language model to sources of context. That's what we'll do with the RAG pattern: we retrieve some information from a document and then use prompt instructions to have the large language model work with that document, so we can have a chat conversation about it in our application.

I'm in the GitHub repository for the starter code; this is the simple AI chatbot I created in the previous video, which I'll also link in the description. You could start with that and improve the chatbot like I did, but, and I don't normally do this, I'm going to recommend that you download the completed source code for this video and simply follow along, because I'm going to go over several different processes and concepts and give several different examples; at the end you should take this project and make it your own. Look for those links in the video description. I'm also once again using OpenAI for this tutorial, so
if you don't have an OpenAI account you'll need to go there and set one up, because we'll need access to the OpenAI API.

Hey guys, just a quick note on the OpenAI API: it does require a credit card for your account, but it only costs pennies; I've been using it a lot and haven't spent much at all. There were some comments on the last video complaining about having a cost for anything, and I realize I provide many free videos and there are lots of free resources on the web, but when it comes to working with this new AI technology, OpenAI is the one most people consider, and that's why I'm using it in this tutorial. Please don't complain in the comments about the cost; spend a few pennies on yourself to learn the latest technology. Okay, moving on.

Okay guys, you have downloaded the code, either the starter code or the completed code, and you've got your OpenAI API key, which should now be in the .env.local file we see over here in the file tree. I'm not going to show you mine, and of course you shouldn't put yours in GitHub either; you'll need to create this file even though you downloaded the code from my repository. After that, looking at the dependencies: you won't have your node modules installed, so open a terminal window, type npm i (or npm install), and press enter. That will install all of the same dependencies I have, and that is especially important for this tutorial if you want it to last to, oh I don't know, next week, next month, or a few months after I publish this, because both Next.js and LangChain are frameworks (or libraries, I should say) that are constantly changing and updating. There could be deprecations and things could break if you simply install the latest packages, but if you install using my package.json, just by typing npm install after downloading the code, you should get the exact versions of the dependencies I have here, and your code will work exactly as
mine does in this video.

Compared to the simple chat application tutorial I previously created, I have added the @langchain/openai dependency and also simply the langchain dependency; those are the two new ones.

Now let's look at some of the code we had before, starting with the frontend component, chat.tsx. Not much has changed. I just want to highlight that in the useChat hook we get from the Vercel AI SDK, I have put in a couple of options. One is the api option, which points to the endpoint, the route handler; by default, if you don't specify it, it goes to /api/chat, and that's what we have here in our chat endpoint under api, where we have the route.ts file we were previously using. I'm going to change this for every example; you can see I've got four examples over here in the file tree. I also added an onError handler just so we could see errors in the console; I found that's very helpful, so you'll want that in there as you experiment with this as well.

For every example we're going to look at a different route.ts file. First, let's look at the one we had from the last tutorial. You can see there aren't many imports at the top: just OpenAI from openai, then OpenAIStream and StreamingTextResponse, because we want to stream the response back and get it as fast as possible. This file sets the Edge runtime, but since I made that last video, Vercel has decided to move away from it, so you'll see in my other examples that I use export const dynamic = "force-dynamic" instead, and I would recommend changing this one as well; maybe I'll do that before I upload the completed source code. Other than that, I'm bringing in the OpenAI API key, and then we have the POST request handler: we get the chat messages, we create the OpenAI client, and we use the model we pick. You can pick any model you want; I'm using gpt-3.5-turbo.
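Put together, the simple handler looks roughly like this (a sketch against the ai and openai package versions pinned in the tutorial's package.json; OpenAIStream and StreamingTextResponse have since been deprecated in newer AI SDK releases, so treat this as era-specific):

```typescript
// app/api/chat/route.ts -- a sketch of the simple OpenAI wrapper described above.
import OpenAI from "openai";
import { OpenAIStream, StreamingTextResponse } from "ai";

// Newer recommendation, instead of the older `export const runtime = "edge"`.
export const dynamic = "force-dynamic";

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

export async function POST(req: Request) {
  const { messages } = await req.json();

  const response = await openai.chat.completions.create({
    model: "gpt-3.5-turbo",
    stream: true,
    messages,
  });

  // Adapt the OpenAI SDK stream into a streaming HTTP response for useChat.
  const stream = OpenAIStream(response);
  return new StreamingTextResponse(stream);
}
```

This is the whole "wrapper around OpenAI": no retrieval, no history handling beyond what the client sends.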
We set stream to true, send our messages, and get a response; it's very simple. Then we stream that back using OpenAIStream and the StreamingTextResponse. That is the code we previously had.

Let's quickly run it so you can see how it works in the browser. I'll clear that npm install, type npm run dev to get this started on localhost, and ctrl-click localhost:3000 to open the application. I'm in dark mode; you could switch to light mode in your globals.css file if you want to. I can type any question here, like "What do you know about Kansas?", which is the state I live in in the US, and we get a fairly lengthy answer back. This is basically a wrapper around OpenAI; this was our simple chatbot. Let's see how we can improve it by learning about LangChain, and eventually we'll add the RAG pattern with our own data.

Okay, back in VS Code, I'll close the terminal window, go to the chat.tsx file, and switch the endpoint from chat to ex1; there we go, so we'll use the example one endpoint. Now let's look at that route handler; I'll open it in the file tree, and we'll see a few things that are different. Notice some of the dependencies are different right away. One thing I noted, again, is that things change fast and get deprecated: the useChat hook needed some help with these LangChain examples, so I also needed to import createStreamDataTransformer, which we use along with StreamingTextResponse. Then I'm bringing in ChatOpenAI; instead of the previous example that used OpenAI directly, we're now using LangChain, so I'm importing that from @langchain/openai. I also need a PromptTemplate, since we're going to send a prompt to chat with, and I need the HttpResponseOutputParser from LangChain. As I noted before, we have dynamic = "force-dynamic" instead of the Edge runtime. Then here's the route handler itself: again a POST request, and we get all of the messages as we did before. Notice that now I am getting the last message from the array of messages, and specifically its content, so I have that one message. I'm not getting the message history yet, but we're preparing to do that in the next example; for now I'm just getting the current message. I've also got a prompt creation here using PromptTemplate.fromTemplate, and right now I'm just passing in the message, essentially the question I'm going to send. Then I have the model. This is a little different from that first example because now we're using LangChain, so we use ChatOpenAI and pass in the API key; in some previous videos and examples I've seen, it would read the API key automatically and you didn't have to provide it here, but now they recommend that you add it, so note that. Then once again the model, and then the temperature, something we didn't previously set.
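Since the useChat hook keeps coming up, here is roughly what the client side looks like (a sketch of the options described above; the markup and endpoint name are illustrative, not copied from the repo):

```tsx
"use client";
// Sketch of the chat component's hook usage (Vercel AI SDK, `ai/react`).
import { useChat } from "ai/react";

export default function Chat() {
  const { messages, input, handleInputChange, handleSubmit } = useChat({
    api: "/api/ex1", // which route handler to hit; defaults to /api/chat if omitted
    onError: (e) => console.error(e), // surface streaming/parsing errors in the console
  });

  return (
    <form onSubmit={handleSubmit}>
      {messages.map((m) => (
        <p key={m.id}>
          {m.role}: {m.content}
        </p>
      ))}
      <input value={input} onChange={handleInputChange} placeholder="Ask something..." />
    </form>
  );
}
```

Swapping the api string is all it takes to point the same UI at each of the four example endpoints.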
The temperature gauges how much the responses will vary: if you're working with very factual data you would probably set it very low, maybe even to zero, but if you want variance in the responses you can set the temperature higher. However, you may also get more hallucinations with a higher value. It goes from 0 to 1; notice I'm at 0.8.

Scrolling down, we're defining a parser now, which is that HttpResponseOutputParser we imported, and then we define a chain, and this is what LangChain is all about. We created our prompt above; then I call .pipe and pass in the model, then .pipe once again and pass in the parser. This is a very simple chain and a very easy way to create one; before we're finished we'll look at a more complex way to create a chain, but for now this simple chain just passes in the question. I'm once again awaiting that stream and passing in the message. By the way, I've commented out some code here that you could use to look at each chunk of the stream in your console, one chunk per line; I left it commented out for you to play around with. Then we send the StreamingTextResponse, but note, and this is the thing I said about the useChat hook, that it now needs createStreamDataTransformer. They may fix this in the future so you could once again just pass in the stream created above, which is what all the official examples show; I had to dig into some GitHub issues to see that they were working through something with how the LangChain response is sent to, or received by, the useChat hook. So here is the StreamingTextResponse sending that information back, and we have all of that in a try/catch.
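Assembled, the example-one handler looks roughly like this (a sketch against the LangChain and AI SDK versions pinned in the repo; import paths and option names have shifted between LangChain releases):

```typescript
// app/api/ex1/route.ts -- a simple prompt -> model -> parser chain (sketch).
import { StreamingTextResponse, createStreamDataTransformer } from "ai";
import { ChatOpenAI } from "@langchain/openai";
import { PromptTemplate } from "@langchain/core/prompts";
import { HttpResponseOutputParser } from "langchain/output_parsers";

export const dynamic = "force-dynamic";

export async function POST(req: Request) {
  try {
    const { messages } = await req.json();
    // Only the latest message for now; chat history comes in the next example.
    const message = messages.at(-1)?.content ?? "";

    const prompt = PromptTemplate.fromTemplate("{message}");
    const model = new ChatOpenAI({
      apiKey: process.env.OPENAI_API_KEY!, // now recommended to pass explicitly
      model: "gpt-3.5-turbo",
      temperature: 0.8, // 0..1: higher = more varied (and more hallucination-prone)
    });
    const parser = new HttpResponseOutputParser();

    // The LangChain "chain": prompt -> model -> parser.
    const chain = prompt.pipe(model).pipe(parser);
    const stream = await chain.stream({ message });

    // useChat currently needs the stream run through createStreamDataTransformer.
    return new StreamingTextResponse(stream.pipeThrough(createStreamDataTransformer()));
  } catch (e: any) {
    return Response.json({ error: e.message }, { status: e.status ?? 500 });
  }
}
```

Functionally it is still a wrapper around OpenAI; the gain is that the prompt, model, and parser are now composable pieces of a chain.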
That way we can catch an error if we need to. Okay, we've already changed the endpoint to example one, and I think we still have the server running; yes we do. Let's go back to Chrome and test this out. I'll type a question like "Tell me a joke." It says, "Why couldn't the bicycle stand up by itself?" and gives us the answer right away: "Because it was too tired." I don't know how good the joke is, but the point is it understood what we were telling it. Again, we've just implemented LangChain, but this is basically like our first example; it's still a simple wrapper around OpenAI.

Hey guys, I hope you're enjoying the video. You may be surprised to learn that three out of every four viewers, nearly 75% of all people who watch my channel, aren't subscribed, so I just wanted to take a quick second and remind you to hit that subscribe button; it really helps me out. And if you really like my videos, you can get exclusive content and support my channel even more by joining my Patreon at patreon.com/davegray. Thanks for your consideration, and now back to the video.

But what if I wanted a conversation here? Say I ask for a joke, and the model asks the question part first, "Why couldn't the bicycle stand up by itself?", which is a pattern it seems to have, and I want it to wait so I can reply "Why?" before it answers, "Because it was too tired." I'll show how to stop the text when it reaches a character like a question mark, but the only real way to do this is to also have the chat history, so the model can refer back to what it asked in the first part of the joke. Let's set that up in example two.

Back in VS Code, I've got the chat component open, and once again I'm changing the endpoint, this time to example two (I keep typing three; changing it to two). Now let's look at the route handler for example two. It's very much like
the first one as far as the imports go; notice one addition: we now import the Message type as VercelChatMessage. That's the only thing we changed up at the top. Scrolling down, notice we have a formatMessage function using that VercelChatMessage type, and we format all of the messages because we're going to have a message history: each entry has a role, either user or assistant, and the message content.

Now, pressing Alt+Z to wrap the lines, you can see I have added a template here as well. Before, we were just passing in the message; now there's a lot more information. Our template says: you are a comedian, you have witty replies to user questions, and you tell jokes. Then we have the current conversation, noted with a chat history placeholder, then "user" with the input, and then "assistant", which would be the AI. Note the backticks, because the template spans multiple lines, wrapped like a template literal; but also note that the placeholders aren't quite the same as template literal interpolation. You don't put a dollar sign in front of the curly braces; you just put the variable names inside them.

Let's scroll down to the route handler itself. Once again we get the messages, but now we need the chat history, so we have formattedPreviousMessages: we take the messages array, slice it, and map over it with formatMessage to format each message in the history. We're once again getting the current message the same way we did before. Now when we create our prompt, we pass in the template I defined above, and we once again define the prompt.
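The helper and template described here are plain TypeScript, roughly like this (a sketch; the VercelChatMessage type normally comes from the ai package but is inlined so the snippet stands alone, and the exact template wording may differ from the repo):

```typescript
// Sketch of the history-formatting helper and the comedian prompt template.
// VercelChatMessage is normally `import { Message as VercelChatMessage } from "ai"`.
type VercelChatMessage = { role: string; content: string };

const formatMessage = (message: VercelChatMessage) =>
  `${message.role}: ${message.content}`;

// Placeholders use bare {curly_braces}, NOT template-literal ${...} syntax:
// LangChain's PromptTemplate fills them in when the chain runs.
const TEMPLATE = `You are a comedian. You have witty replies to user questions and you tell jokes.

Current conversation:
{chat_history}

user: {input}
assistant:`;

// Formatting a short history the way the route handler does:
const history = [
  { role: "user", content: "tell me a joke" },
  { role: "assistant", content: "Why don't scientists trust atoms" },
]
  .map(formatMessage)
  .join("\n");
```

Keeping the placeholders as plain {names} is what lets the same string be both readable text and a fillable LangChain template.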
We still have the model, and nothing has changed there except that I also passed in verbose: true. Now if you look in the Node.js / Next.js console (Ctrl+backtick to open the terminal), once we start using this you'll see a lot more output, because verbose: true essentially walks you through everything the chain is doing; I wanted to include that so you knew about it. I left the temperature where it was for now. But notice what we do when we create the chain: the parser is the same, but as I pipe in the model I add .bind and tell it to stop at the question mark. That gets it to stop at the end of its question and not give us the second part of the joke; it will absolutely stop, and it won't actually know what it's going to say next. We have to provide the history so the large language model can read what it previously asked us, and then of course it will see our response, which will be "why" or "what" or something simple like that. So now, in the chain.stream call we await here, we're not only passing in the input with the current message, we're also passing in the chat history; notice it's an array and we're joining it on line breaks, the "\n". After that, we stream it the same way we did before. So that is the difference.

Let's make sure we're on example two; yes, we are. Back in Chrome, it looks like everything has refreshed, but I'll refresh manually just in case. I'll say "Tell me a joke," and it should stop: "Why don't scientists trust atoms." Notice it doesn't have the question mark there, because I put that in the bind and told it to stop, so it omitted the question mark; but besides that, we got it to stop after it asked us the question. Then I can just say "Why?" It reads the history, knows right away what it had asked us, and replies with the second half of the joke. That's one way to tell it's actually using the history. Also, with that verbose setting, we can see exactly how it worked through all of this, though there is a lot to look at, so we'll scroll back to the very first entry. You can see it received all of that direction from the template, and scrolling down, you can see it sent "Why don't scientists trust atoms" (of course after it said "Sure" when we said "Tell me a joke"), and then it received the stop; that's all it sent, and it had the stop here in the keyword arguments as well. Coming down, here is the next entry, and you can see what else it received: the same content again, but with more than it had before. It had the current conversation: "user: tell me a joke", "assistant: sure" with the joke, then "user: why", waiting on the assistant reply. It had all of that conversation to base its reply on, and then it gave us the reply, which was the second half of the joke.
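The stop-at-question-mark chain described above looks roughly like this inside the handler (a sketch; the function wrapper and variable names mirror the walkthrough rather than the repo exactly):

```typescript
// Sketch of the example-two chain: bind a stop sequence so the model halts at
// the question mark, and pass the formatted history into the prompt.
import { StreamingTextResponse, createStreamDataTransformer } from "ai";
import { ChatOpenAI } from "@langchain/openai";
import { PromptTemplate } from "@langchain/core/prompts";
import { HttpResponseOutputParser } from "langchain/output_parsers";

export async function streamJokeReply(
  template: string,
  formattedPreviousMessages: string[],
  currentMessageContent: string
) {
  const prompt = PromptTemplate.fromTemplate(template);
  const model = new ChatOpenAI({
    apiKey: process.env.OPENAI_API_KEY!,
    model: "gpt-3.5-turbo",
    temperature: 0.8,
    verbose: true, // log each chain step to the server console
  });
  const parser = new HttpResponseOutputParser();

  // .bind({ stop: ["?"] }) halts generation at the question mark, so the model
  // asks its setup question and then waits for the user's "why".
  const chain = prompt.pipe(model.bind({ stop: ["?"] })).pipe(parser);

  const stream = await chain.stream({
    chat_history: formattedPreviousMessages.join("\n"), // history as newline-joined lines
    input: currentMessageContent,
  });
  return new StreamingTextResponse(stream.pipeThrough(createStreamDataTransformer()));
}
```

Because the stop sequence is consumed rather than emitted, the question mark itself never appears in the reply, exactly as seen in the demo.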
Moving to example three, I'll close the terminal, go to the chat component, and change the endpoint to three. You saw from the last example that we were able to add a comedian personality to our chatbot; this example is closer to the one in the docs, where the prompt turns the chatbot into a pirate named Patchy, and the directions say all responses must be extremely verbose and in pirate dialect, which gets fairly interesting. Once again you can see we're passing in the chat history and the user input; nothing else really changed. I just wanted to give a different example of how you can prompt through LangChain and give the large language model a personality. Let's go back to Chrome, refresh just to be sure, and tell Patchy we want to hear a joke. I'll say "Tell me a joke," and it's pirate dialect for sure: "Arr matey, ye be wanting a joke, do ye..." I'm not going to read through all of that (I don't sound like a good pirate at all), but the point is you can add a different personality, and we hardly changed anything except those prompt directions. This is just how we work with LangChain, and we've improved our bot and added a personality.

But now, how do we actually apply the RAG pattern? How can we add in our own data, that extra step? Let's look at the final example. Back in VS Code, I'm going to change the chat component to use example four, and after saving that, let's look at the route handler for example four. This has a few more changes than we saw before, and actually I have two examples in one here. We're going to use JSON data, because that is frequently used on the web, and we're going to look at how you can bring in JSON from a document, but also how you could work with a plain JSON object.
That is, you can just use a JSON object and chat with the data that's in that object. To start with, we have all of the same imports we had in the other examples up at the top, but I've added a few more, with a space on line 9 to delineate the difference so you can see what else we're adding. We're adding the JSONLoader. We're also adding RunnableSequence; before, we were piping our chain, and this is just a different way to do it, and we need a little more complexity in this example. I also have formatDocumentsAsString, because we're going to have documents, which is what LangChain calls the data that loads in from our JSONLoader, for example. Then I've also got the CharacterTextSplitter; you can see I'm not using it right now, but it is in one of the examples, because you would use it with a JSON object instead of a file, so I wanted to have it in here too.

Notice, before we've even gotten to the route handler, here is my loader, and I'm loading in some JSON data. I have a data file called states.json in a data folder; it has information about all 50 United States. When loading it, I could just pass the file in the first param, but (pressing Alt+Z to wrap the line) I'm actually pointing to specific properties in the JSON data. You can do it that way too: load a file but only pull a few of the properties. I'm loading quite a few, but I wanted to highlight that you can load specific properties instead of loading all of them.

After that, we once again have dynamic = "force-dynamic", and we still have the formatMessage function. Now let's look at the template; it's changed quite a bit. It says: answer the user's questions based only on the following context; if the answer is not in the context, reply politely that you do not have that information available. Now we're providing a placeholder for the context, which we also identify in the template. We still have the chat history, and now we have a user question; before, I called that placeholder input, and you could still call it whatever you want, but question made more sense to me as I set this template up.

Now let's get to the route handler. We still bring in the messages, we still build the message history with formattedPreviousMessages, and I still have the current message. Here is where we load the documents from the JSON file; once again, that's what LangChain calls them, so each object is essentially a document, and LangChain has its own format for that. We get docs with await loader.load(). But let's also look at how we could do this with an object instead: I'm going to comment that out and uncomment the object example that uses the text splitter I talked about before. Here is one state object with the different properties I want to load in, for the state of Kansas, where I live in the United States. Using the text splitter, docs equals await textSplitter.createDocuments(), and inside an array bracket I've got JSON.stringify, passing in my object. You could of course pass in more than one object, because we're just using the JSON.stringify you're probably already familiar with; you could take a previously fetched or constructed JSON object and pass it in this way if you want to. I'm going to run the example with just this one object to start, and then we'll go back and read the file, which will have data on more than just one state, so we can compare.

Once again, when we create the prompt we just pass in the template. The ChatOpenAI setup has changed a little: notice I have set the temperature to zero, which is an important difference here, and I left verbose at true. I can't remember if I had streaming in the others or not; we were streaming using the Vercel AI SDK anyway, but I've got streaming set to true inside ChatOpenAI as well. We get down to the parser and it's the same, but now here is what has really changed: the RunnableSequence. As we define our chain, we have RunnableSequence.from, and I'm passing in some input. Most of this input comes from when we call the chain down below: the question receives input.question, and the chat history is input.chat_history, which relates to what we pass in down there as well; these all just get passed down the chain. Notice the context: we don't really pass anything in for it at call time. We've already got the docs from above, so it's handled separately here; we have formatDocumentsAsString, which was imported at the top (because the context does need to be a string), and I'm passing in the docs. Then through the chain we go to the prompt, to the model, and to the parser, just like we did with the previous piping, but we needed a spot to pass in that context, and this shows you how to use a RunnableSequence. It's a simple one; LangChain can get much, much more complicated, and you can chain together many separate chains that you create. But this works here: we get our history, and we get our RAG, because we've retrieved that document and all the data inside the JSON document (or in this first run, just the object), and then we stream it back; everything else works the same.

Let's check this out in the browser with example four in our chat component; we should be able to chat with our JSON object about Kansas. I'll say "What can you tell me about?" and it answers that it can tell me about Kansas, which is really all it knows about right now. Let's ask a few questions: "What is the capital of Kansas?" The capital of Kansas is Topeka; that is correct. "What date was it admitted to the US?" Notice I said "it" instead of Kansas, but from the chat history it should still know we're talking about Kansas, and we get the admission date back, which is what I would expect.

Now let's go back to VS Code and switch it to use the JSON document that has all 50 states. Back in example four, I'll uncomment line 49, where we load the docs from the loader, and comment out lines 52 through 67, where we had our large object for the state of Kansas. Now we should get data from the states.json file with all 50 states in it. Back in the browser, I'll start the conversation with "Hello"; "How can I assist you?" That's nice. "What is the capital of Kansas?" Topeka; a nice short answer, but that works. "What is the nickname of California?" The Golden State; nice. Now I'll ask, "What state did I ask about before California?" and it remembers I asked about Kansas. "Does it have a Twitter?" Notice I used "it" in the context of my conversation, so I'm talking about Kansas, because I asked about that when I said what state did I ask about before California. It says yes, it does. "What is it?" There we go: the Twitter address for the Kansas state government.

Now I'm going to ask for something I know is not in the context file, and remember, the template in our prompt told it to answer only from the information in that context. I'll say "What is the state bird?" and it replies, "I do not have that information available." If we hadn't told it to do that, the large language model could use its other sources and actually say that the state bird is the Western Meadowlark. It's all in prompt engineering, which is another huge area to learn about.

So congratulations, you now have a RAG application. Learning LangChain can get very complex, but like most things, the more you work with it, the easier it will become. I recommend learning more about it by building your own project: do something like a chat with your resume, or a chat with a PDF document. LangChain provides loaders for all different types of files and data.
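Assembled, the example-four handler looks roughly like this (a sketch against the pinned versions; the states.json path and the JSON pointer list are illustrative of the approach described above, not guaranteed to match the repo exactly):

```typescript
// app/api/ex4/route.ts -- RAG over local JSON data (sketch).
import { Message as VercelChatMessage, StreamingTextResponse, createStreamDataTransformer } from "ai";
import { ChatOpenAI } from "@langchain/openai";
import { PromptTemplate } from "@langchain/core/prompts";
import { RunnableSequence } from "@langchain/core/runnables";
import { HttpResponseOutputParser } from "langchain/output_parsers";
import { JSONLoader } from "langchain/document_loaders/fs/json";
import { formatDocumentsAsString } from "langchain/util/document";

// Load only selected properties from the JSON file via JSON pointers.
const loader = new JSONLoader("src/data/states.json", [
  "/state", "/code", "/nickname", "/website", "/admission_date", "/capital_city",
]);

export const dynamic = "force-dynamic";

const formatMessage = (m: VercelChatMessage) => `${m.role}: ${m.content}`;

const TEMPLATE = `Answer the user's questions based only on the following context. If the answer is not in the context, reply politely that you do not have that information available.

Context: {context}

Current conversation:
{chat_history}

user: {question}
assistant:`;

export async function POST(req: Request) {
  try {
    const { messages } = await req.json();
    const formattedPreviousMessages = messages.slice(0, -1).map(formatMessage);
    const currentMessageContent = messages.at(-1)?.content ?? "";

    const docs = await loader.load(); // each JSON entry becomes a LangChain Document

    const prompt = PromptTemplate.fromTemplate(TEMPLATE);
    const model = new ChatOpenAI({
      apiKey: process.env.OPENAI_API_KEY!,
      model: "gpt-3.5-turbo",
      temperature: 0, // factual data: keep variance (and hallucinations) low
      streaming: true,
      verbose: true,
    });
    const parser = new HttpResponseOutputParser();

    // RunnableSequence maps the chain's call-time input into the template
    // variables, injecting the retrieved documents as the context string.
    const chain = RunnableSequence.from([
      {
        question: (input: { question: string; chat_history: string }) => input.question,
        chat_history: (input: { question: string; chat_history: string }) => input.chat_history,
        context: () => formatDocumentsAsString(docs),
      },
      prompt,
      model,
      parser,
    ]);

    const stream = await chain.stream({
      question: currentMessageContent,
      chat_history: formattedPreviousMessages.join("\n"),
    });
    return new StreamingTextResponse(stream.pipeThrough(createStreamDataTransformer()));
  } catch (e: any) {
    return Response.json({ error: e.message }, { status: e.status ?? 500 });
  }
}
```

For the single-object variant, you would instead build docs with something like `await new CharacterTextSplitter().createDocuments([JSON.stringify(stateObject)])` (from langchain/text_splitter) in place of the loader call.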
From there, you'll get started learning about embeddings, vector databases, and all types of retrieval chains. It's a large topic and there's a lot to learn, but without a doubt it will only benefit you in today's job market. Let me know your thoughts in the comments.

Hey guys, giving a quick shout out to my patrons: Holy Coder is a Pro supporter, Lad is a member at the senior level, and thank you to all of the junior members; you're all helping me reach my goals. If you haven't checked out my Patreon, it has exclusive content and early-release content, and it's not one of those Patreons that doesn't get many posts; I'm active on there every week, so please check it out. Remember to keep striving for progress over perfection; a little progress every day will go a very long way. Please give this video a like if it's helped you, and thank you for watching and subscribing; you're helping my channel grow. Have a great day, and let's write more code together very soon.
Info
Channel: Dave Gray
Views: 15,073
Keywords: build an ai rag application, ai, rag, rag app, rag application, build a rag app, build a rag application, nextjs, next.js, js, langchain, rag langchain, ai langchain, ai rag, ai nextjs, ai next.js, langchain js, langchain next.js, langchain nextjs, langchain node, langchain typescript, ai app, ai application, chatbot, langchain chatbot, ai chatbot, rag chatbot, nextjs chatbot, how to build a rag app, langchain intro, lang chain, rag llm, rag ai, rag app tutorial
Id: YLagvzoWCL0
Length: 33min 34sec (2014 seconds)
Published: Tue Apr 30 2024