Let's Build a "Chat With PDF" App Using LangChain (TS/JS), AI SDK, Pinecone DB, OpenAI & Next.js 13

Video Statistics and Information

Captions
In this video we're going to build an AI-powered chat app that lets us talk to our PDF documents using Next.js, LangChain, ChatGPT, the AI SDK and Pinecone DB. This is a conversational app: you can ask any question and it will answer based on the information it finds in the PDF document. If the information you ask about is not in the PDF, it will tell you "sorry, I don't have the information to answer that" instead of making up an answer. This is long-form content where we architect an app and also build it from scratch. I learned a lot making it, and I hope you will learn a lot too. With that said, let's get straight into the video.

Since I want to make this slightly more beginner friendly, in the next two sections I'll quickly explain the different ways to make your LLMs smarter and give a brief overview of the tech stack. If you already know a lot about the GPT APIs and the tech stack I mentioned, you can skip to the architecture section, where I explain the architecture of the app we're going to build.

If you're not a data scientist, you probably don't care about building LLMs; you care more about building real-world apps around LLMs. To build real-world apps like the one we're going to build, we have to make our LLMs a bit smarter and more specific to our task. There are two ways to do that. The first is fine-tuning: training an LLM further on another set of data to make it act in a certain way. For example, if you're building an AI doctor, you would take an existing LLM that has been trained on a large dataset and fine-tune it, or train it a bit more, on doctor-specific data to make it behave like a doctor. The second way is to add a knowledge store: in this approach we give a knowledge store to our application architecture and use it to provide context to the LLM, and the LLM answers based on the context we pass. This approach is great for search operations, when you want to find information in large amounts of data, and it is what we will use to build our app.

Now the tech stack. LangChain is an AI API interface layer that gives you a nice set of APIs to interface with several different LLMs and vector DBs and, most importantly, lets you chain several tasks together. We're also going to use the ChatGPT APIs from OpenAI; you'll typically need an account on platform.openai.com to get access to the GPT-3.5 APIs. Next is the AI SDK from Vercel, a pretty neat npm library that offers hooks and APIs that further simplify your integration with tools like LangChain and with standalone LLMs. For example, I built two apps, one with and one without the AI SDK, and by including the AI SDK I was able to cut about 40% of my codebase; you will soon see why. The last one is Pinecone DB, a vector database that enables us to store and retrieve embeddings.

What are embeddings? Embeddings are just vectors, nothing but mathematical representations of data, and when they are stored in the database they are stored based on their semantic meaning. For example, if you convert the word "apple" to an embedding of, say, dimension four, you get a value along each dimension. If I store "grape" in the same database, the grape will also have values along the same dimensions, and in the database the two will be stored close together. If you then ask the database "give me fruits that are close to the color red", it will return apple and grape, because along that dimension apple and grape lie close to each other.
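To build a little intuition before moving on, here is a minimal standalone sketch (not part of the app, and not code from the video) that embeds a few words with OpenAI and compares them with cosine similarity. The import path assumes the 2023-era langchain package used later in the video, and it expects OPENAI_API_KEY to be set.

```ts
import { OpenAIEmbeddings } from "langchain/embeddings/openai";

// Cosine similarity: close to 1 means "similar meaning", close to 0 means unrelated.
function cosineSimilarity(a: number[], b: number[]): number {
  const dot = a.reduce((sum, x, i) => sum + x * b[i], 0);
  const norm = (v: number[]) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (norm(a) * norm(b));
}

async function main() {
  const embeddings = new OpenAIEmbeddings(); // reads OPENAI_API_KEY from the environment
  const [apple, grape, car] = await embeddings.embedDocuments(["apple", "grape", "car"]);

  // Semantically related words end up close together in the vector space.
  console.log("apple vs grape:", cosineSimilarity(apple, grape)); // relatively high
  console.log("apple vs car:  ", cosineSimilarity(apple, car));   // lower
}

main().catch(console.error);
```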
This is the architecture of the app we're going to build. The architecture splits into two parts. The part on the left deals with preparing our knowledge store; the part on the right is about the user interaction and how we get our answers from the document living inside the knowledge store. The knowledge store contains the document we want to search over, in the form of embeddings. As I explained in the last step, embeddings are just mathematical representations of our data: we split the big document into smaller chunks, convert each chunk into an embedding using OpenAI's text-embedding model, and store them in the database. Once this is done, our knowledge store is ready. On the right-hand side, when the user chats with our LLM, we first convert the question into a standalone question based on the chat history. The new question is then converted into an embedding using the same OpenAI text-embedding model that was used to embed the document chunks, and we use that embedding to search for the top five matching results from our knowledge store. Once we have the top five matching text chunks, we pass them as context along with the new question and instruct our LLM clearly to answer only from the context we passed and to reply "I don't know" if the answer is not found in the context. Based on that context, the LLM gives us the final answer. A rough pseudocode summary of both flows follows the step list below.

We are going to break the app development process into ten steps and build them one by one. Step one: set up our Next.js app with shadcn/ui, including dark mode. Step two: build the UI components required for the chat app. Step three: set up all the configuration and the external accounts we need for OpenAI and for accessing Pinecone DB. Step four: set up the knowledge store using a Pinecone DB index. Step five: chunk our document, embed it and store it inside the Pinecone index. Step six: prepare the LLM templates and instances using LangChain. Step seven: build a chain using LangChain and stream the information back to the client side using the AI SDK. Step eight: prepare the chat endpoint that uses the stream and sends it back to the client side inside Next.js 13. Step nine: consume the streamed information with the AI SDK inside our chat UI to make the app conversational. Step ten: get the source information and send it as part of the stream, so we know which sources the LLM used to generate the final answers.
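Here is the rough pseudocode summary of the two flows from the diagram. Every helper below is an illustrative placeholder, not a real API; the concrete implementations come in the later steps.

```ts
// Illustrative pseudocode only: each declared helper is a placeholder, not a real library call.
declare function splitIntoChunks(pdfPath: string): Promise<string[]>;
declare function embed(texts: string[]): Promise<number[][]>;
declare function storeInPinecone(chunks: string[], vectors: number[][]): Promise<void>;
declare function makeStandaloneQuestion(question: string, chatHistory: string): Promise<string>;
declare function searchPinecone(vector: number[], topK: number): Promise<string[]>;
declare function askLLMWithContext(question: string, context: string[]): Promise<string>;

// Ingestion (left side of the diagram): run once to prepare the knowledge store.
async function prepareKnowledgeStore(pdfPath: string) {
  const chunks = await splitIntoChunks(pdfPath);   // smaller pieces of the big document
  const vectors = await embed(chunks);             // OpenAI text-embedding model
  await storeInPinecone(chunks, vectors);          // stored by semantic meaning
}

// Query (right side of the diagram): run for every chat message.
async function answerQuestion(question: string, chatHistory: string) {
  const standalone = await makeStandaloneQuestion(question, chatHistory);
  const [questionVector] = await embed([standalone]);        // same embedding model as ingestion
  const topChunks = await searchPinecone(questionVector, 5); // top five matching chunks
  // The LLM is told to answer only from topChunks and to say "I don't know" otherwise.
  return askLLMWithContext(standalone, topChunks);
}
```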
All right, step one: set up the app with shadcn/ui. If you look at the shadcn/ui installation instructions for Next.js 13, you'll find everything you need to set up the app, and the dark-mode page has the instructions for adding dark mode. Once you're done, your app should have all the components you just installed for dark mode and for the base setup. For this app, at least for me, I had to downgrade Next.js to 13.4.2, so if the latest Next.js version works for you, use that; if not, feel free to downgrade. Your globals.css file should look like this; if it doesn't, copy the contents from the shadcn/ui docs. Another thing: I've marked my package type as "module" so that ESM imports are the default, and the files that use CommonJS imports have to be marked as .cjs files. For example, you can see that I've used .cjs for my PostCSS config and my Tailwind CSS config. With this, the setup is done.

Now let me go into my page.tsx file and explain what's happening. I have a main container, and inside the main container a sticky header; inside that there's a span with some text and the dark-mode toggle. If I run the app, it looks like this: I can switch between light, dark and system. With this, step one is complete.
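As a rough sketch, the page.tsx described above could look something like this. The ModeToggle import assumes the component from the shadcn/ui dark-mode guide; its path, the header text and the Tailwind classes are assumptions, not the video's exact code.

```tsx
// app/page.tsx: main container with a sticky header and the dark-mode toggle
import { ModeToggle } from "@/components/mode-toggle"; // from the shadcn/ui dark-mode guide (path assumed)

export default function Home() {
  return (
    <main className="relative container flex min-h-screen flex-col">
      <header className="sticky top-0 z-50 bg-background py-4">
        <div className="flex items-center justify-between">
          <span className="font-bold">Chat With PDF</span>
          <ModeToggle />
        </div>
      </header>
      {/* The Chat component from step two will be rendered here later. */}
    </main>
  );
}
```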
For step two, I want to build a UI with two chat bubbles, the first for the AI and the next for you, and when we get the answers back we should also be able to show the source information in the chat bubble. That's what we'll build in this section. The data will be dummy data for now; once we have real data from our chat endpoint we'll populate that instead.

First we build the chat bubble component. To build it we need to import some components from shadcn/ui, so let me add those: an accordion component, a card component and an input component. On top of that we need to install the ai npm package (the AI SDK), a library called react-wrap-balancer that lets us wrap the text properly, and react-markdown to render the source text as markdown.

Now let's build the chat bubble component. I've imported all the required libraries and components it needs. First I add a simple wrapText method that takes the content and wraps it in span tags so we can pass it to react-wrap-balancer. Then we define the ChatBubbleProps interface, which extends the Message type from the AI SDK. The ChatBubble uses the role, content and sources from the props; the content is passed through wrapText and we get back the wrapped message. The component itself is a Card from shadcn/ui with a card header, card content and card footer. The card header contains the title of the chat bubble: if the role is "assistant" you see "AI", and for all other roles you see "You". The card content renders the wrapped message inside the react-wrap-balancer component. Finally, the card footer contains the accordion: we check whether the sources array is empty, and if it isn't we iterate over it and build an accordion item for each source, rendering the source text with react-markdown. You can also see a simple utility method used to clean up the source text, because the source text comes straight from the knowledge store and isn't properly formatted; we add that method to our utils and import it. With that, our chat bubble component is ready.

To use the chat bubble we import it into our chat.tsx component, which is what we'll use inside page.tsx; it also needs the input and button components. We don't have real data yet, so we simply use dummy data, and I build the chat view from it: a parent div that holds all the chat information, with a first container that renders the chat bubbles by iterating through the messages array and building a ChatBubble for each message. I check the role: if the role is not "assistant" we don't show any source information, and if it is "assistant" we show the sources. After that there's a form with our input and button components. Then I simply import the chat component into page.tsx and check that it works. You can see it's working as expected: we get the dummy data, and the accordion text shows for the AI messages but not for the user. With this, the UI step is complete.
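A condensed sketch of the ChatBubble component described above, assuming the 2.x AI SDK Message type. The formattedSourceText helper name stands in for the small clean-up utility mentioned in the video, and the exact markup differs from the real repo.

```tsx
// components/chat-bubble.tsx: one message rendered as a card, with sources in an accordion
"use client";

import Balancer from "react-wrap-balancer";
import ReactMarkdown from "react-markdown";
import { Message } from "ai";
import { Card, CardContent, CardFooter, CardHeader } from "@/components/ui/card";
import { Accordion, AccordionContent, AccordionItem, AccordionTrigger } from "@/components/ui/accordion";
import { formattedSourceText } from "@/lib/utils"; // clean-up helper from the video (name assumed)

interface ChatBubbleProps extends Partial<Message> {
  sources: string[];
}

export function ChatBubble({ role = "assistant", content, sources = [] }: ChatBubbleProps) {
  if (!content) return null;

  return (
    <div className="mb-4">
      <Card>
        {/* "AI" for the assistant role, "You" for everything else */}
        <CardHeader>{role === "assistant" ? "AI" : "You"}</CardHeader>
        <CardContent>
          <Balancer>{content}</Balancer>
        </CardContent>
        <CardFooter>
          {sources.length ? (
            <Accordion type="single" collapsible className="w-full">
              {sources.map((source, index) => (
                <AccordionItem value={`source-${index}`} key={index}>
                  <AccordionTrigger>{`Source ${index + 1}`}</AccordionTrigger>
                  <AccordionContent>
                    <ReactMarkdown>{formattedSourceText(source)}</ReactMarkdown>
                  </AccordionContent>
                </AccordionItem>
              ))}
            </Accordion>
          ) : null}
        </CardFooter>
      </Card>
    </div>
  );
}
```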
Next we create a .env file at the root of the app; there are a few values we need to fill in. The first is the OpenAI API key, which you can easily get by making an account on platform.openai.com. Then come the Pinecone API key and the Pinecone environment: in the same fashion you need an account with Pinecone DB, go to API keys and create a new key, and you'll get both the environment and the API key you need. Then comes the Pinecone index name. A Pinecone index is just an organizational unit for your Pinecone database; it's how Pinecone organizes your data. There are two ways to create an index: through the Pinecone console, or using the create-index API. I recommend creating the index through the console, but if you want to do it from code you can use the API. Keep in mind that index creation takes a while, so code that tries to access the index right after calling the create-index API has to wait for about three to five minutes; you most likely need a retry mechanism or a timeout, which is why we have a timeout here. After the index you see a namespace. The namespace basically lets us organize data inside the index: if you store two documents in one index, you can give namespace A to document 1 and namespace B to document 2, and when you request data for a specific document you use the namespace to target the document you want to talk to. On line 16 there is a path to where your document lives. We then install another library called zod, create a config.ts file, pass all the environment variables through a schema and export that schema, so that whenever we access an environment variable we can expect it to be there.

In this step we create our Pinecone client. Let's make a pinecone-client.ts file and install the Pinecone DB npm library; the Pinecone client instance from this library is what we use to talk to the Pinecone database. The file has three methods. The first creates the index in Pinecone DB: as I said in the last section, I recommend creating the index through the console, but if you cannot, this is the approach. For index creation you call the create API and pass a config that contains the index name, dimension and metric; you can read more about them in the Pinecone docs, but for now let's stick with this configuration. We also add a delay to give the Pinecone index time to be created, and if there is a failure we throw an error. The second method initializes the Pinecone client; at initialization time we also check whether the index is already there, and if not we call the create-index method, then return the initialized client. Finally, we export a getPineconeClient method that returns the initialized Pinecone client to be used across our app.

Now we're going to embed our PDF document and store it inside our Pinecone DB. I've added a new docs folder and copied in the file I want to embed. Back in the console, I install two libraries: langchain and pdf-parse. We'll use LangChain's PDF loader to load and chunk the file, and the loader needs the pdf-parse library. I create a new file called pdf-loader.ts and import the PDFLoader, the RecursiveCharacterTextSplitter from LangChain, and my env config. This file chunks our PDF and returns the chunked documents: I load the document with the PDFLoader, then use the RecursiveCharacterTextSplitter to split it into chunks of 1,000 characters with an overlap of 200 characters, so that we don't break things too abruptly between chunks, and return the chunked docs.
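A sketch of the pdf-loader.ts file described above, using the 2023 LangChain import paths. It assumes the zod-validated env config exposes the document path as PDF_PATH; that variable name is an assumption, since the video doesn't spell it out.

```ts
// lib/pdf-loader.ts: load the PDF and split it into overlapping chunks
import { PDFLoader } from "langchain/document_loaders/fs/pdf";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import { env } from "./config"; // zod-validated environment variables (PDF_PATH is assumed)

export async function getChunkedDocsFromPDF() {
  try {
    const loader = new PDFLoader(env.PDF_PATH);
    const docs = await loader.load();

    // ~1000-character chunks with a 200-character overlap so sentences
    // are not cut off too abruptly between chunks.
    const textSplitter = new RecursiveCharacterTextSplitter({
      chunkSize: 1000,
      chunkOverlap: 200,
    });

    return await textSplitter.splitDocuments(docs);
  } catch (e) {
    throw new Error("PDF docs chunking failed!");
  }
}
```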
The next file we make is vector-store.ts. The vector store will eventually have two methods, but for now we only build the first one, so I import everything the file needs. The method is called embedAndStoreDocs and it embeds the documents for us. It accepts two arguments: the first is the Pinecone client, and the second is the chunked documents we just produced. Inside the method we have a try/catch block as usual; we create a new instance of the embeddings and access our Pinecone index, which is already initialized, and then we await PineconeStore.fromDocuments, which comes from LangChain, since LangChain gives us an interface to use with Pinecone DB. We pass in the chunked documents and the embeddings, that is the OpenAI embeddings; this is an LLM API used internally, and the embedding model is text-embedding-ada-002. If there is any failure we throw an error.

Now we write our script. We create a new folder called scripts with a file called pinecone-embed-docs.ts, and inside it we import everything we've just made: the getChunkedDocsFromPDF method from the PDF loader, the embedAndStoreDocs method from the vector store, and getPineconeClient from the Pinecone client file. We add an async method that runs everything: first we get the initialized Pinecone client, then we await getChunkedDocsFromPDF, then we await embedAndStoreDocs. On top of this I install another library called tsx in order to run the script, and I add a small npm command that loads the env config and runs the script under the scripts folder.

Now we run the script to see whether it creates the index, chunks our PDF document, embeds the chunks and pushes them to the index. As you can see, it chunks the document, but the index creation most likely failed; this is most likely a Pinecone error caused by the fact that we need to give the index a bit more time to be created. All right, now it's initialized and our index is ready; you can see there are no vectors inside the index yet, which means there are no embeddings. We try again, and this time it succeeds: if I click through, we have 325 vectors, and our document is chunked and saved inside the index.
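A sketch of the embedAndStoreDocs method from vector-store.ts, assuming the 2023 versions of the langchain and @pinecone-database/pinecone packages; the env variable name for the index is an assumption.

```ts
// lib/vector-store.ts: embed the chunks and store them in the Pinecone index
import { PineconeClient } from "@pinecone-database/pinecone";
import { OpenAIEmbeddings } from "langchain/embeddings/openai";
import { PineconeStore } from "langchain/vectorstores/pinecone";
import { Document } from "langchain/document";
import { env } from "./config"; // PINECONE_INDEX_NAME is an assumed variable name

export async function embedAndStoreDocs(client: PineconeClient, docs: Document[]) {
  try {
    const embeddings = new OpenAIEmbeddings(); // text-embedding-ada-002 under the hood
    const pineconeIndex = client.Index(env.PINECONE_INDEX_NAME);

    // LangChain embeds each chunk and upserts it into the index; "text" becomes the metadata key.
    await PineconeStore.fromDocuments(docs, embeddings, {
      pineconeIndex,
      textKey: "text",
    });
  } catch (error) {
    throw new Error("Failed to embed and store the docs!");
  }
}
```

The script in the scripts folder then just wires getPineconeClient, getChunkedDocsFromPDF and this method together, and is executed with tsx via the npm command mentioned above.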
Now that we have embedded our documents, it's time to prepare the LLM instances and templates. We use two LLM instances in our chain: one is a non-streaming model and the other is a streaming model, which gives us a stream to consume on the client side. Let's create both: I add an llm.ts file, import ChatOpenAI from LangChain's chat models, and create the two models, one streaming and one non-streaming, with the temperature set to zero for both because we don't want them to digress too much.

Once that's done, we create two templates. The first is a template to generate the standalone question, and the second is the template used to get the detailed answer. For that I create a prompt-templates.ts file inside our lib folder and copy the templates in. The first one takes the chat history and the question we have and generates a standalone question; the second is the QA template, which receives the context, that is the information we get back from the vector database, and then asks the LLM for the final answer. With this, step six is complete.
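A sketch of the two LLM instances and the two prompt templates, using the 2023 LangChain import path. The template wording is illustrative rather than the video's exact text; what matters is the {chat_history}, {question} and {context} placeholders that the chain fills in.

```ts
// lib/llm.ts
import { ChatOpenAI } from "langchain/chat_models/openai";

export const streamingModel = new ChatOpenAI({
  modelName: "gpt-3.5-turbo",
  streaming: true,
  temperature: 0, // keep answers focused, no digressions
});

export const nonStreamingModel = new ChatOpenAI({
  modelName: "gpt-3.5-turbo",
  temperature: 0,
});

// lib/prompt-templates.ts
export const STANDALONE_QUESTION_TEMPLATE = `Given the following conversation and a follow up question,
rephrase the follow up question to be a standalone question.

Chat History:
{chat_history}
Follow Up Input: {question}
Standalone question:`;

export const QA_TEMPLATE = `Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say you don't know. DO NOT try to make up an answer.

{context}

Question: {question}
Helpful answer:`;
```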
In this step we build the conversational chain using LangChain. We use an API called ConversationalRetrievalQAChain from LangChain, and this is how it looks: the first argument is the LLM instance, which will be the streaming model we just created; the second is the retriever, which is our vector store instance; and then there are options where we pass the non-streaming model that generates the standalone question for us. We use two templates: the first is for the standalone question, and once the chain has that question it uses the second to get our final answer. It works just like the section of the architecture diagram we looked at earlier, so let's go ahead and build it.

We create a new file called langchain.ts. Since the retriever is the second argument of the API, we first go back to vector-store.ts and build a getVectorStore method. It accepts the client, and on line 29 you can see that it creates an instance of the embeddings, the same LLM API we used before, then gets the Pinecone index, and with that information uses the PineconeStore API from LangChain to get an instance of the vector store. Interestingly, the textKey here is "text", the same key we used when we embedded the documents; this is used as the metadata key when we filter information out of Pinecone, and you can read more about it in the section called metadata filtered search. The method returns the vector store instance to be used in our langchain.ts file.

Now for langchain.ts. We import some APIs from the ai package, the streaming model we built in the last step and our prompt templates, and we define the call-chain args, which hold the question and the chat history. Then we build the callChain method. First we sanitize the question; next we get the initialized Pinecone client and pass it into getVectorStore to get our vector store, which will act as the retriever. From LangChainStream we use two things, the stream and the handlers. If you didn't use this, you would have to build your own handlers and stream the data yourself with plain LangChain, but with the AI SDK the streaming is done through its APIs, which makes life much easier.

Then we build our conversational QA chain. As discussed, the first argument is the streaming model, and the next uses the vector store as the retriever. Then come the QA options: the qaTemplate is the question template used to get the final answer, the questionGeneratorTemplate gets us the standalone question from the LLM, and questionGeneratorChainOptions contains the non-streaming model, so that small chunk of code is what produces the standalone question in that part of the architecture, while the QA template is used in the last part. We also set returnSourceDocuments to true because we will use the source documents later. In the next lines we call the chain, simply passing the question and the chat history along with the handlers we got from LangChainStream, and then we use the AI SDK's API to send the stream back. Note that we haven't awaited the chain call the way it's done in the documentation, because we want the stream to be available as the tokens arrive.
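A simplified sketch of the callChain function in langchain.ts, assuming the getPineconeClient and getVectorStore helpers described above live under @/lib. The option names match the 2023 LangChain JS ConversationalRetrievalQAChain API, but details such as file paths and error handling are assumptions.

```ts
// lib/langchain.ts
import { ConversationalRetrievalQAChain } from "langchain/chains";
import { LangChainStream, StreamingTextResponse } from "ai";
import { getPineconeClient } from "@/lib/pinecone-client";
import { getVectorStore } from "@/lib/vector-store";
import { streamingModel, nonStreamingModel } from "@/lib/llm";
import { STANDALONE_QUESTION_TEMPLATE, QA_TEMPLATE } from "@/lib/prompt-templates";

type CallChainArgs = {
  question: string;
  chatHistory: string;
};

export async function callChain({ question, chatHistory }: CallChainArgs) {
  const sanitizedQuestion = question.trim().replaceAll("\n", " ");
  const pineconeClient = await getPineconeClient();
  const vectorStore = await getVectorStore(pineconeClient);
  const { stream, handlers } = LangChainStream();

  const chain = ConversationalRetrievalQAChain.fromLLM(
    streamingModel,
    vectorStore.asRetriever(),
    {
      qaTemplate: QA_TEMPLATE,
      questionGeneratorTemplate: STANDALONE_QUESTION_TEMPLATE,
      returnSourceDocuments: true, // used later for the sources step
      questionGeneratorChainOptions: { llm: nonStreamingModel },
    }
  );

  // Intentionally not awaited: tokens stream back through the handlers as they arrive.
  chain
    .call({ question: sanitizedQuestion, chat_history: chatHistory }, [handlers])
    .catch(console.error);

  return new StreamingTextResponse(stream);
}
```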
Now that we have our LangChain API prepared, we build the chat route to consume it. Under app I create a new api folder, inside it a chat folder, and inside that a route.ts file. This route receives the chat messages from the user and passes them on to the callChain API we built in langchain.ts. I import callChain and the Message type from ai, and add a small utility method to format the chat history. Then we have a POST request handler: we read the messages from the request body, prepare the chat history from the messages variable using the format-message utility, and take the last item as the question. If there is no question we return a Next.js error response; if everything looks good we pass the question and chat history on to callChain and return the streaming text response, and if something fails we return a 500. With this, our chat endpoint is complete and our backend is pretty much ready, so in the next steps we use the chat endpoint and build the front end.

In this step we build the chat UI. For that we use the useChat hook from the AI SDK, a really nice hook that lets you interface with the chat endpoint we just built and get the streaming output from the chat response. We simply use this API and rebuild our UI: I copy the example, go to our chat component and paste it in. I also add some initial messages and put them in utils, something close to our document, which is about the German Basic Law, then go back to chat.tsx and import the initial messages from utils. Next I use the messages from the hook inside our ChatBubble (for now without any sources), wire handleSubmit onto the form component we built earlier, pass the value as the input, and pass handleInputChange to the onChange handler so we have a proper UI in place. I've also added a small component for an animated spinner; there's an isLoading flag on the useChat hook, so we simply use that and wire it up. This component is a client component, so we also have to mark it as such. Now I ask our AI app some questions, for example "can you summarize the document?", and you can see we get streamed answers from the AI based on the document we just embedded. If I ask something that is not in our document, it just tells us it doesn't know the answer. It seems to be working great.

As the streaming happens I want the view to scroll down, so I add a small utility method to utils.ts. The method scrolls to the end of the view; it accepts a container ref, and when we call it, it simply scrolls that container to the end. We use it inside our chat component in a useEffect, so it's called every time the messages state changes, and we pass in the ref. If I ask a question now, the container scrolls to the end as the answer streams in, which just makes our UI feel more refined.
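A sketch of the chat component wired up with useChat from the 2.x AI SDK. The initialMessages and scrollToEnd helpers stand in for the small utilities described above; their exact names, the spinner and the markup are assumptions.

```tsx
// components/chat.tsx
"use client";

import { useEffect, useRef } from "react";
import { useChat } from "ai/react";
import { ChatBubble } from "./chat-bubble";
import { Input } from "@/components/ui/input";
import { Button } from "@/components/ui/button";
import { initialMessages, scrollToEnd } from "@/lib/utils"; // helper names assumed

export function Chat() {
  const containerRef = useRef<HTMLDivElement | null>(null);
  const { messages, input, isLoading, handleInputChange, handleSubmit } = useChat({
    api: "/api/chat",
    initialMessages,
  });

  // Keep the newest tokens in view while the answer streams in.
  useEffect(() => {
    scrollToEnd(containerRef);
  }, [messages]);

  return (
    <div className="flex h-[75vh] flex-col justify-between rounded-2xl border">
      <div ref={containerRef} className="overflow-auto p-6">
        {messages.map(({ id, role, content }) => (
          <ChatBubble key={id} role={role} content={content} sources={[]} />
        ))}
      </div>
      <form onSubmit={handleSubmit} className="flex p-4">
        <Input
          value={input}
          onChange={handleInputChange}
          placeholder="Ask something about the document..."
          className="mr-2"
        />
        {/* The video uses a small animated spinner component while isLoading is true. */}
        <Button type="submit" className="w-24">{isLoading ? "..." : "Ask"}</Button>
      </form>
    </div>
  );
}
```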
We have come to the last step of our app, where we get the sources information that is used to generate our answers and send it back as part of the stream. There is an interesting experimental StreamData API in the AI SDK that lets us append data and send it as part of the streaming text response. We go back to our codebase, into langchain.ts, and inside our LangChainStream call we set experimental_streamData to true, then create an experimental StreamData instance so we can append our sources information to the stream data. How do we access the sources? On line 36 you can see the returnSourceDocuments: true flag, which means that by default four source documents come back as part of our callChain response. We can access the callChain response with a simple .then handler: we take the first two of the four items we get back, map them down to just the information we need from the page contents, and once our data is ready we append it to the stream data and call data.close(). Then we pass the data along with our streaming text response, and this data can be accessed on the client side as part of the useChat hook.

Let's go back to our chat app and try to consume it. The data item of the useChat hook contains the stream data we sent from langchain.ts. The remaining problem is that we still have to match the data with the appropriate chat message, so we add a small utility to utils.ts that maps the sources information to the right AI text message. This method is not perfect; I think the AI SDK will get updates that make this easier to do within the SDK itself, but until then we use it to map the sources to the right AI message. I simply use this getSources helper for the sources prop, and when we go back to the app and ask a question, we now get the sources information that was used to generate the answer.

With this, our app is complete. The entire codebase of this app is on my GitHub (the PDF chat AI SDK repo) with all my updates, so go take a look, play around with the code and let me know what you think. Thanks for watching the video; if you liked it, please leave a like and subscribe, and I will talk to you soon.
Info
Channel: Raj talks tech
Views: 5,420
Keywords: ai, langchain, nextjs, chat with pdf
Id: oiCFr19NtPo
Length: 39min 52sec (2392 seconds)
Published: Sun Aug 27 2023