Build Chatbots with Memory using Langchain and HuggingFace || ChatPDF with Memory

Captions
Hi all. Based on the previous video on ChatPDF, many of you went back and wonderfully tried to implement ChatPDF yourselves and to incorporate LangChain's memory option into your conversational bots, while some of you faced challenges and requested a separate video on conversational chains using memory. In this video we are going to cover the topic of memory in LangChain, and we shall also see a ChatPDF that incorporates memory into its operation.

I did try a couple of different models that are available on Hugging Face (I really appreciate that you brought this to my notice), and I found that many of the smaller, freely available models on the Hugging Face Hub do not handle conversations well. The larger models do, and one of the best models I have found so far for conversations is the OpenAI chat model. For this tutorial I tried the TII UAE Falcon-7B model and Flan-Alpaca-GPT4-XL; while these models have been designed for specific tasks, they are not well suited to conversational chains. One model that did work is a three-billion-parameter model, so I implemented everything in this Google Colab environment. I also wanted to share this implementation with you, and for that purpose Google Colab seems to be the better option.

Let us go ahead. First of all we need to install certain packages: chromadb, pydantic, and sentence-transformers. You will notice that I have specified particular versions of these libraries. The reason is that one of the latest updates made certain changes to the pydantic library, which causes dependency issues; we shall see what that dependency issue is as we go through, but to avoid any conflicts or errors it is advisable, for now, to use these specific versions of the libraries and to install them in this order. The ordering matters when you are installing libraries that have dependencies. After that we install langchain and huggingface_hub. Once these libraries (including transformers) are installed, we import what we need: from langchain we import HuggingFaceHub, and from langchain.chains we import ConversationChain.

Just as we do in all our videos, we set our Hugging Face API key here, and we use the lmsys FastChat model for our conversational chain. When I tried multiple models, I found this one to give the best results for conversations; for summarization or other tasks there are different models, but for conversation chains this one works fine. So I created a HuggingFaceHub object, which gives us our LLM, and I have written four different queries, query one through four.

In the memory module of LangChain there are primarily three types of memory classes that are used extensively. The first is ConversationBufferMemory, in which the entire history of your conversation gets stored in a buffer.
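To make the setup concrete, here is a minimal sketch of the installation and model setup described above. The captions do not state the exact pinned versions, so none are pinned here; the repo id `lmsys/fastchat-t5-3b-v1.0` is an assumption based on the three-billion-parameter FastChat model mentioned in the video.

```python
# Install in this order (the video pins specific versions that the
# captions do not state):
#   pip install chromadb pydantic sentence-transformers
#   pip install langchain huggingface_hub transformers

import os
from langchain import HuggingFaceHub
from langchain.chains import ConversationChain

# Set your Hugging Face API key (placeholder value).
os.environ["HUGGINGFACEHUB_API_TOKEN"] = "hf_..."

# Assumed repo id for the 3B FastChat model; model_kwargs are illustrative.
llm = HuggingFaceHub(
    repo_id="lmsys/fastchat-t5-3b-v1.0",
    model_kwargs={"temperature": 0.1, "max_new_tokens": 256},
)
```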
One of the limitations of using a buffer memory is that as you go through longer conversation threads, the size of the history (its number of tokens) increases. This leads to longer processing times for the model, and at the same time there will be limitations in terms of the token size. So let us first work through ConversationBufferMemory and then see the alternatives.

I create a memory object, a ConversationBufferMemory, and then a conversation chain: we call ConversationChain, passing our LLM and our memory object. Now we give it our first input: "Hi, my name is Saurabh. I do have some questions for you." For this conversation chain we call the predict function, passing the input as query one. The output generated is "Hello Saurabh, I'm here to help you with any questions you have", plus some additional tokens. You can see this is not a perfect model, but the result we get is still relevant, so we shall stick with it; you can play around with other freely available models on Hugging Face.

The second query I ask is "I live in India. Who was the first president?" It understood from the context that I am asking about the first President of India; however, the answer is incorrect. The first President of India was not Pandit Jawaharlal Nehru; he was the first Prime Minister. So our model did not give us a correct answer, but semantically let us assume this is fine.

Now, memory.load_memory_variables helps you inspect the history of the conversation chain. Everything we have said to the model so far has been put in the history, the responses have also been put in the history, and the last exchange we had with the model appears as Human and AI turns. The entire chain so far is stored under one key, "history". When we now call our conversation chain and ask "What is my name?", it goes back to its history, finds that my name is Saurabh, and the model correctly answers. This is how it uses the memory, the history of the conversation, to produce the output. Our fourth query was "Where do I live?"; it understands that I live in India and gives the right answer. Here I print the memory buffer, which gives me the list of the human and AI turns as a thread; when you load the memory variables it presents the history key as well as any other variables (currently we don't have any). So this is how ConversationBufferMemory works; I hope this is making sense and you are getting a feel for what we are doing here.
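A short sketch of this buffer-memory flow, reusing the `llm` object from the setup above; the query wording follows the video:

```python
from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory()
conversation = ConversationChain(llm=llm, memory=memory)

conversation.predict(input="Hi, my name is Saurabh. I do have some questions for you.")
conversation.predict(input="I live in India. Who was the first president?")
conversation.predict(input="What is my name?")   # answered from the stored history
conversation.predict(input="Where do I live?")

print(memory.load_memory_variables({}))  # {'history': 'Human: ...\nAI: ...'}
print(memory.buffer)                     # the raw human/AI thread
```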
Next, because of the limitations of the buffer, in terms of storage and token count for longer conversational threads, we shall use another memory class in LangChain: ConversationBufferWindowMemory. In a buffer window memory we can specify the number of exchanges we want to keep in our history; I have specified k=1. So what happens here? Our first query works fine, and for our second query we pass the context, so that is fine too. For the third query I ask "What is my name?", and it says "I am sorry, but I do not have access to your personal information." The reason is that with k=1 it only stores the most recent exchange; it does not keep any earlier history. Let me change k to 2. Now, when query 3 is run, the memory buffer holds the previous two exchanges as its history; whatever conversation thread occurred before that is dropped. So with k=2 the last two exchanges are kept in the history, the memory buffer. This is one technique, where we specify the number of exchanges to keep.

Now, what if we do not want to restrict the memory by the number of exchanges, but rather by the length of the conversation thread we want to preserve? For that we have ConversationSummaryBufferMemory. Here we specify a max token limit: the most recent conversation, up to that many tokens, is kept verbatim, and everything that precedes it is summarized into a form that preserves the context while staying within the token limit. Let's see how this goes. I give my first input, query one: "My name is Saurabh. I do have some questions." It says "Hello Saurabh, I'm here to help you with any questions you have." Second, I pass "I live in India. Who was the first president?" and get the same answer as before. Third, I pass "What is my name?" and it says "I'm sorry, but I don't know your name; can you tell me your name?", or words to that effect. The reason is visible in the history: "The human asks the AI if it can help him with a question. The first President of India was Jawaharlal Nehru...", followed by "What is my name?" verbatim. Our last exchange is kept as is; everything that preceded it, because it exceeded the token limit of 80, was condensed into a summary. That summary does not contain the name, and therefore the model responded that it doesn't know it. So this is how conversational memory flows in LangChain.

So you now understand the three techniques: ConversationBufferMemory, ConversationBufferWindowMemory, and ConversationSummaryBufferMemory. Using these techniques, here primarily ConversationBufferMemory, we will now add memory to the ChatPDF we implemented in our earlier video; I'll attach the link in the description as well as in the video's i-card. A sketch of the window and summary-buffer memories follows.
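A minimal sketch of the two alternative memory classes discussed above, again reusing `llm`; `k=2` and `max_token_limit=80` follow the values mentioned in the video:

```python
from langchain.memory import (
    ConversationBufferWindowMemory,
    ConversationSummaryBufferMemory,
)

# Window memory: keep only the last k exchanges verbatim.
window_memory = ConversationBufferWindowMemory(k=2)
window_chain = ConversationChain(llm=llm, memory=window_memory)

# Summary-buffer memory: keep the most recent ~80 tokens verbatim and
# summarize (using the LLM itself) everything that precedes them.
summary_memory = ConversationSummaryBufferMemory(llm=llm, max_token_limit=80)
summary_chain = ConversationChain(llm=llm, memory=summary_memory)
```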
Now, one of the problems I mentioned earlier, and the reason you should install specific versions of the packages: the pydantic package has been upgraded, and some of pydantic's base settings functionality has been moved to pydantic-settings. We have reported a bug on GitHub, but for now there is nothing we can do with the upgraded version, so we use a specific version that works fine with both LangChain and ChromaDB.

Just as we did in the earlier video, we install pypdf (chromadb and sentence-transformers are already installed), and we import langchain, chromadb, and the other pieces: PyPDFLoader, RecursiveCharacterTextSplitter, HuggingFaceEmbeddings, PromptTemplate. This all stays as it was in our ChatPDF lecture; everything remains the same, and I'll quickly show you the change we made. We import the API key and provide the file here; you can upload a file to your Colab environment and then load it, getting all the pages. RecursiveCharacterTextSplitter we talked about earlier; go check that video out. Then come the HuggingFaceEmbeddings we created, and the query we ask is "What is Naive Bayes?"; it will look within our document via similarity search. Now for the model: earlier we used the Falcon-7B model in ChatPDF; here we shall use the FastChat model from this tutorial, which, as I mentioned, works better for conversation chains. Everything else remains the same.

Next we specify a template: "Using the following context and chat history, answer the following question." This is just a formatting prompt template I created, with three variables: history, context, and question. We pass the template and our prompt is ready. For memory we create a ConversationBufferMemory, specifying the memory key as "history" and our input key as "question". For the retrieval chain, here is the change: we need to incorporate memory into the retrieval chain. We pass our LLM object, the chain type, and the retriever, the document retriever built from the PDF we provided, on which the responses should be based. Additionally, we pass the prompt and the memory through the chain-type arguments; this attaches a memory buffer to our retrieval chain (see the sketch after this section).

Now look at the responses. I first ask "What is the mathematical formulation of Naive Bayes?", and it returns "The mathematical formulation of Naive Bayes is as follows...", and so on. Whether this is the right or wrong answer is a separate question; what we want to check is whether the memory works. The next query I ask is "Which machine learning algorithm are we talking about?", and it rightly remembers that we are talking about the Naive Bayes machine learning algorithm, so our memory object is working. When I call load_memory_variables, I can see it holds the history of the first question we asked the model, and it rightly remembers what we talked about earlier. So our conversation thread gets built up as we use memory. With more advanced models, some of which are proprietary, you will get much better results than with open-source models; you can go ahead, fine-tune these models, work with them, and see which model works best for your scenario. Keep learning, and if you like the content make sure to give it a thumbs up. See you in the next lecture. Have a nice day. Bye bye.
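For reference, a minimal sketch of the retrieval chain with memory described above. The file name, chunk sizes, and retriever settings are illustrative assumptions (the captions do not state them); the prompt wording follows the template described in the video, and `llm` is the FastChat model from the earlier setup.

```python
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Chroma
from langchain.prompts import PromptTemplate
from langchain.chains import RetrievalQA
from langchain.memory import ConversationBufferMemory

# Load and split the uploaded PDF ("document.pdf" is a placeholder;
# chunk sizes are assumed, not taken from the video).
pages = PyPDFLoader("document.pdf").load()
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
docs = splitter.split_documents(pages)

# Embed the chunks into a Chroma vector store for similarity search.
db = Chroma.from_documents(docs, HuggingFaceEmbeddings())

# Prompt with the three variables mentioned in the video.
template = """Using the following context and chat history, answer the question.
Chat history: {history}
Context: {context}
Question: {question}
Answer:"""
prompt = PromptTemplate(
    input_variables=["history", "context", "question"], template=template
)

# Memory keyed to match the prompt variables.
memory = ConversationBufferMemory(memory_key="history", input_key="question")

# Retrieval chain with the prompt and memory passed via chain_type_kwargs.
qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=db.as_retriever(),
    chain_type_kwargs={"prompt": prompt, "memory": memory},
)

print(qa.run("What is the mathematical formulation of Naive Bayes?"))
print(qa.run("Which machine learning algorithm are we talking about?"))
print(memory.load_memory_variables({}))
```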
Info
Channel: datahat -- simplified data science
Views: 1,268
Keywords: data science, machine learning, data analysis, python, langchain, chatpdf, huggingface, chatbot with memory, chatbot using python, chatbot using langchain memory, conversation chain with langchain memory, conversation chain langchain, langchain memory, langchain tutorial, chatpdf with langchain memory, langchain chatbot, langchain demo, langchain chatbot memory, chatbot, langchain memory buffer, working with memory in langchain, chatbot with memory using langchain
Id: Rk7yq4U8CV8
Length: 14min 50sec (890 seconds)
Published: Sat Sep 09 2023