Chatbot Memory: Retrieval Augmented Generation (RAG) Chain | LangChain | Python | Ask PDF Documents

Video Statistics and Information

Captions
In the previous video we downloaded a PDF from PubMed about pancreatic cancer and created a retriever that searched through the document and found the most relevant chunks of text, while the large language model formulated the answer. In this video we are going to build a new chain that allows us to ask follow-up questions; in other words, we want to add memory to our chat.

We have created a new file called 4_memory. We start with PyPDFLoader, which provides functionality for loading PDF documents, so let's import it. Before we start, let's take a look at the PDF document: as you can see, it is a review article about pancreatic cancer. One line of code initializes the PDF loader, and the next line uses the loader to store each page of the PDF in the pages variable.

Let's look at the first three documents in the pages variable. Each one carries metadata: a source, which is the path to where the file is located, and a page number. Inspecting the metadata of the first three documents, the path is the same for every page, while each chunk has its own page number.

Since the pages of the PDF are still quite large, we break them into smaller chunks with a text splitter, allowing overlaps of 20 characters and chunk sizes of 1,000 characters, so that the documents variable holds smaller text chunks than a single PDF page. Comparing lengths, documents has 88 elements, which is more than the 28 elements of pages.

We now import the OpenAI API key from the .env file. Next, we use OpenAI embeddings to convert each of the text chunks (the documents) into numeric vectors. The reason for this conversion is that searching through a large number of text chunks is very time-consuming.
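The chunk-then-embed-then-compare idea described above can be sketched in plain Python. This is an illustrative toy, not the LangChain or OpenAI API: the letter-count "embedding" is a hypothetical stand-in for a real embedding model, and the splitter only mirrors the chunk-size and overlap parameters used in the video.

```python
import math

def split_text(text, chunk_size=1000, chunk_overlap=20):
    """Break text into chunks of at most chunk_size characters,
    with chunk_overlap characters shared between neighbours."""
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - chunk_overlap
    return chunks

def toy_embed(text):
    """Stand-in embedding: counts of a few letters. A real model
    returns a high-dimensional float vector instead."""
    return [text.lower().count(c) for c in "abcdefghij"]

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def retrieve(query, chunks, k=2):
    """Return the k chunks whose vectors are most similar to the query vector."""
    q = toy_embed(query)
    scored = sorted(chunks, key=lambda c: cosine(q, toy_embed(c)), reverse=True)
    return scored[:k]
```

A real vector store such as Chroma precomputes and indexes the chunk vectors, so at search time only the query needs embedding, which is why vector comparison scales so much better than rescanning the raw text.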
Numeric vector comparisons, however, are extremely fast. For now we use the OpenAI embedding functionality, so let's initialize the embedding. We need to store the vectors that come from the embedding in a vector database; in this video we use the Chroma database. This is a local vector database that will be destroyed after we close this document, but it is good enough for our purposes, so let's run it. For some reason it is taking a little longer than usual; maybe that is because I am recording the screen.

We are going to use the chat model of OpenAI as our large language model, and we also import the output parser so that it converts the output of the LLM into a string when we need it. We define the retriever to take the question, compare it with all the vectors in the vector database, and return the most similar chunks of text. Let's run the cell.

Now, here is the main part: adding memory to the chat. We first need a question maker. The reason is that once the user asks a new question, there is a history of questions and answers in his or her mind that is not transferred along with the new question. The idea is to reformulate the user's question into a form that contains the initially missing context. We use the LLM to perform this reformulation: the user's follow-up question goes into the LLM, and the LLM reformulates the question and gives it some context. This is the question-maker prompt, which is built from a series of messages.
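Before looking at the prompt's messages one by one, the reformulation idea just described can be sketched without any LLM calls. Here `fake_reformulate` is a hypothetical stand-in for the real prompt-plus-LLM question chain: it simply splices the last exchange into the follow-up, which is enough to show what "adding the missing context" means.

```python
def fake_reformulate(question, chat_history):
    """Stand-in for the real question chain (prompt | llm | parser).
    A real LLM would rewrite the follow-up into a standalone question;
    here we just prepend the previous exchange as explicit context."""
    if not chat_history:
        # No history: the question already stands on its own.
        return question
    last_q, last_a = chat_history[-1]
    return f"Regarding the question '{last_q}' and the answer '{last_a}': {question}"

# The moon example from the video, as (question, answer) pairs.
history = [("What is the shape of the moon?", "The moon is roughly spherical.")]
standalone = fake_reformulate("Can you explain more?", history)
```

The empty-history branch matters: as the video does later, a question with no prior chat should be passed through untouched rather than sent to the LLM for reformulation.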
The first message is an instruction to the system, which is just a string telling the LLM to take the history and the question and reformulate them into a standalone question; we instruct it not to answer the question, only to reformulate it. Next is a placeholder, which we imported earlier, that will take the chat history later on, followed by the question from the user. The question chain, which reformulates the user's question into a question with history, is made of the prompt, the LLM, and the output parser.

Let's take a look at how the question chain works. Assume we previously asked about the shape of the moon and the AI responded that the moon is spherical. Now the user follows up asking for further explanation but does not give the context, so the question chain has to add the context to the original question. In this cell we invoke the question chain: we pass the question, which is "can you explain more", and we also add the chat history as a human message with the corresponding content. This is a chat history we are making up for demonstration, and here is the reformulated question; as you can see, the context has been added to the question.

Next is the prompt for question and answer. It consists of a Python list with a system instruction, a placeholder that takes the chat history later on, and the user's question. Let's run it to create a Q&A prompt. We also need a function to decide which question to pass to the LLM. We define a function that takes the chat history into account: if there is a history, it passes the question chain, which means the user's question is reformulated; if there is no chat history, it passes the user's question directly. This is the function we just talked about, with the condition that returns a reformulated question and, if there is no history, the plain user question. I have commented out one part of the code, since it only inserts line breaks between the
retrieved chunks of text and doesn't have much effect. At this point we define a retriever chain. The idea is to build a chain that passes along the following variables: the context, for which we use the vector retriever; the question, which is the reformulated one if there is a chat history; and the chat history itself, which is a Python list of chat messages. For this purpose we use RunnablePassthrough to add the context variable to the other two variables that come in as inputs.

Let's look at an example of the retriever chain. In this example we see the extra context variable that has been added to the question and chat history variables, so the retriever chain takes the chat history and the question as input variables. It seems I forgot to run this cell, so let's run it again. Here the chat history and the question are passed through as they are received, and the assign function adds the context to the dictionary and passes everything to the next link in the chain.

And here is the main chain, the retrieval augmented generation (RAG) chain. It is made of the retriever chain, the question-and-answer prompt, and the LLM. We might want to add the output parser to the chain so that we get a pure string out of it. This is the question about pancreatic cancer. We initialize the chat history as an empty list, because in the beginning there is no chat. The RAG chain takes the question and the chat history as inputs, and then we extend the chat history. Since we assume that the chat history is a list of structured human messages, we cannot keep the output parser here, so let's remove it. Let's run the cell: here is the output of the AI, and this is the answer to our question. We could also print out the content of the AI message. Then comes the follow-up question, shown here without the history.
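The assign step of the retriever chain can be mimicked in plain Python. This sketch imitates what the pass-through-and-assign pattern does in the video's chain: the input dictionary with the question and chat history flows through unchanged, and a context key computed by a retriever is added. The `fake_retriever` is a hypothetical stand-in for the real vector retriever.

```python
def assign_context(inputs, retriever):
    """Pass `question` and `chat_history` through unchanged and add
    `context`, the chunks the retriever found for the question."""
    enriched = dict(inputs)  # keep the incoming variables as they are
    enriched["context"] = retriever(inputs["question"])
    return enriched

# Hypothetical retriever: returns canned chunks for any question.
fake_retriever = lambda question: ["chunk about pancreatic cancer risk factors"]

out = assign_context(
    {"question": "What are the risk factors?", "chat_history": []},
    fake_retriever,
)
```

The next link in the chain (the Q&A prompt) then receives all three variables at once, which is exactly why the pass-through step is needed: the prompt expects context, question, and history together in one mapping.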
We now invoke the chain with the follow-up question and the chat history, which was extended above, and in the next line we add the response to the chat history again. Let's look at the response by executing the cell: as you can see, it works, because when we ask it to explain more, the AI is aware of its previous response.
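The history bookkeeping just described can be summed up as a loop. Here `fake_rag_chain` is a hypothetical stand-in for the real RAG chain; the point is the pattern: invoke the chain with the current history, then extend the history with the new human message and AI response so that the next follow-up question carries its context.

```python
def fake_rag_chain(question, chat_history):
    """Stand-in for the real chain; reports how much context it saw."""
    return f"Answer to '{question}' (history length: {len(chat_history)})"

chat_history = []  # empty at the start, since there is no chat yet

for question in ["What is pancreatic cancer?", "Can you explain more?"]:
    answer = fake_rag_chain(question, chat_history)
    # Extend the history so the next turn can be reformulated with context.
    chat_history.extend([("human", question), ("ai", answer)])
```

On the second iteration the chain already sees two history entries, which mirrors why the follow-up "can you explain more" works in the video: the reformulation step has earlier turns to draw on.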
Info
Channel: CompuFlair
Views: 3,827
Id: PtO44wwqi0M
Length: 12min 9sec (729 seconds)
Published: Mon Feb 26 2024