Chatbot Answering from Your Own Knowledge Base: LangChain, ChatGPT, Pinecone, and Streamlit | Code

Video Statistics and Information

Captions
Hey, hi everyone. In this video we will build a chatbot using LangChain, OpenAI's ChatGPT, and Pinecone. Why Pinecone? Because we want ChatGPT to answer from our own documents, say your internal company documents or your own knowledge base, rather than from its own memory. We already have a couple of videos on building a ChatGPT chatbot with Streamlit, and another on Pinecone, but many people reached out asking for an end-to-end video: the Streamlit chat UI, the ChatGPT API, answers that come from our own documents, and the ability to answer follow-up questions.

Let me show you what I mean. This is what we're going to build. I'll ask one of the questions I have here, 'How is India's economy?', and it answers not from the model's own knowledge but from the documents we have. It says India is the sixth-largest economy; that might not be factually correct, because the demo documents were themselves generated with ChatGPT. The point is that the answer comes from our documents, from the Pinecone index.

Now the other important thing. Let me ask a follow-up like 'and its culture'. On its own that query is not sufficient, yet the bot should be able to figure out that I'm talking about India's culture even though I didn't mention India. When you use the ChatGPT UI you get this for free: the chat has memory, and it knows my earlier question was about India's economy, so 'and its culture' must mean India's culture. But do you see something here, the 'refined query'? It rewrote 'and its culture' into something like 'What are the cultural challenges facing India?'. It isn't perfectly accurate, but I'll show you how this refined query is created and what its purpose is.

If you remember an earlier video: when a query needs to be answered from your own vector database via semantic search, we pass the query through an embedding model and then find the relevant documents in Pinecone. But look at the query we actually have. If we pass only 'and its culture', what kind of semantically matching documents would we get? It isn't sufficient to match anything in the Pinecone index. The ChatGPT conversation can understand it, but in our case we need to bring in context so that ChatGPT can answer, and to retrieve that context this query is not enough. That is why we need a step called query refinement, which transforms the query into something that can be matched: take 'What are the cultural challenges facing India?', search with that, and you'll get context about India's culture. That's what refining the query means.
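To make that retrieval step concrete, here is a minimal sketch of the 'embed the query, then search Pinecone' idea, assuming a pinecone-client 2.x index and a 384-dimensional Sentence Transformer (the function name find_match comes from later in the video; the model name all-MiniLM-L6-v2 is my assumption for a 384-dimensional model):

# Embed the (refined) query, then ask Pinecone for the top matching chunks.
import pinecone
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('all-MiniLM-L6-v2')   # produces 384-dim vectors

pinecone.init(api_key="YOUR_API_KEY", environment="YOUR_ENVIRONMENT")
index = pinecone.Index("chatbot")                 # index name used in the video

def find_match(query, top_k=2):
    query_vec = model.encode(query).tolist()
    result = index.query(vector=query_vec, top_k=top_k, include_metadata=True)
    # LangChain's Pinecone store keeps the chunk text under metadata['text'].
    return "\n".join(m['metadata']['text'] for m in result['matches'])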
Refining the query could also look like this: suppose I ask 'and its relationship with USA'. Whose relationship with the USA? That's something we need to refine, and we'll see how to do a proper semantic search even when the user's query by itself is not sufficient. Now we get the answer about India and the United States, but look at the refined query: it reads 'What is the relationship between India and the United States?', when all we had was 'relationship with USA'. What the refinement logic does is take the conversation so far plus the user's current query and create a refined query that can actually be checked against the Pinecone index, so that we get the relevant context. We'll see how to do that.

First of all, let me delete this index so we can start from scratch. We go to Pinecone; if you're not familiar with it, as I said, I've made multiple YouTube videos where Pinecone is a part, including one where we build custom question answering with Pinecone, so you can go and check those. In this video we're combining both things: GPT's conversational capability with semantic search over our own documents. So we delete that index and create a fresh one. What should the dimension of the index be? We're going to store vectors here, and I'm going to use a Sentence Transformer model whose embedding dimension is 384, so the index dimension is 384; we'll see later how to find a model's dimension. I'll create it now, because creating the index takes some time.

While it's creating: I have a notebook with the indexing logic. In a data folder I have documents in different formats, some PDF, some txt. I want to process these documents and index them into Pinecone; that's the first thing. Once the documents are indexed, we build the Streamlit chat application with conversational capability, meaning it answers follow-up questions, but the answers come from the Pinecone index, our knowledge base.

This code is part of an earlier tutorial on LangChain semantic search with Pinecone. The only difference is that last time I used, I believe, the OpenAI embeddings, and here I want to show you how to use the Sentence Transformer embeddings. We need much the same requirements as before: we install langchain and openai, we need sentence-transformers because we're using Sentence Transformer embeddings, and a few packages LangChain requires when you want to process your files into chunks, small paragraphs you can store in Pinecone. That was also covered in the earlier video, but here I'll cover whatever is needed for this purpose.
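For reference, the index creation from a couple of paragraphs back looks roughly like this with the pinecone-client 2.x API (the cosine metric and the extra packages in the comment are my assumptions; the video only shows the Pinecone dashboard):

# pip install langchain openai sentence-transformers pinecone-client unstructured
import pinecone

pinecone.init(api_key="YOUR_API_KEY", environment="YOUR_ENVIRONMENT")

# 384 matches the Sentence Transformer embedding size discussed above.
pinecone.create_index("chatbot", dimension=384, metric="cosine")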
These imports are all things related to what we're going to use: LangChain is going to parse this directory and read the files. I've put a link here so you can read more about it; it says 'not found', maybe the docs got updated, but anyway. The whole idea is that we're going to use document loaders from LangChain. One of the good things about LangChain is that when you want to build something like semantic search, you need a way to process different kinds of files and split them into small chunks so you can store those chunks in a vector database. LangChain has something called a DirectoryLoader, which simply takes the path of the directory where your documents are and loads them for you. Eventually we'll see that when we use the DirectoryLoader, it finds our three documents.

Not only that: these documents could be big, maybe a couple of thousand or even 50,000 words, a hundred pages. We can't put everything in a single chunk, because ultimately we want to take a chunk and pass it to a question-answering model, so we want our contexts to be small chunks. The other thing is that to answer something you may need just one small paragraph, not all ten pages, so we want small chunks; that's what this logic does. Just like LangChain has document loaders, it also has text splitters, which split the documents we got above. Multiple splitters are available, you can go and check; I'm going to use the RecursiveCharacterTextSplitter. What it does is first find paragraphs and collect them into a chunk; if a paragraph is too big, it falls back to collecting lines, and so on recursively. The important thing is that this splitter requires a chunk size, here 500 characters, and a chunk overlap. When you split your text, one part goes into chunk A and the next into chunk B; with an overlap, a line at the boundary can go into both chunk A and chunk B, so the chunks share some overlapping context. Eventually we take the text splitter and call split_documents on the same documents we loaded with the LangChain DirectoryLoader. So we had three documents, but after splitting, each chunk is about 500 characters long with a 20-character overlap (ideally it should be more like 50 characters), and we end up with about 30 small documents.
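Put together, the loading-and-splitting step looks roughly like this against the 2023-era LangChain API (the data/ path follows the video's folder name):

from langchain.document_loaders import DirectoryLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Load every file in the data/ directory (PDF, txt, ...).
loader = DirectoryLoader('data/')
documents = loader.load()           # finds 3 documents in the video

# Split into ~500-character chunks with a 20-character overlap.
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=20)
docs = text_splitter.split_documents(documents)
print(len(docs))                    # about 30 chunks in the video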
How does each chunk look? This is just one chunk from somewhere in our whole document: docs is a list, and we can look at, say, the eighth document, one small chunk. Since we now have chunks, we can insert them into Pinecone. This is also covered many times; it's the same code as the last video, except there I explained the OpenAI embeddings. Rather than OpenAI embeddings you can use the Sentence Transformer embeddings, and you just need to tell it which model you want to use. So let's load that particular Sentence Transformer model and look at a sample embedding: you'll see it has length 384. That's why we created the index with dimension 384. If you look at the index right now, it doesn't have any vectors, because we deleted the old one and haven't pushed anything.

Now we need Pinecone so we can index these documents. We import pinecone; LangChain supports many vector databases, and you can go and check all the vector stores it supports. We use the Pinecone one because it has a method to populate the index from the documents we just created above. First we init our Pinecone account, which requires an API key and an environment. In the Pinecone console, under API Keys, you'll find your own keys, and when we created the index we created it in some environment (mine is a us-east AWS one), so you copy both of those. We have these two things; I've already copied them in, and 'chatbot' is the name of the index. If you look now, we don't have any documents, or rather vectors, in the index, because we haven't pushed anything; the index is empty, zero vectors.

Now let's push something. This is the code: Pinecone.from_documents takes the documents we created, an embedding instance so it can create embeddings of the documents, and the index name, and stores everything in that particular index. Run it, and we should see as many vectors as we had chunks. We had 30 chunks, so ideally we should see 30 vectors, and indeed the index now shows 30 vectors. Our knowledge base, the vector database, is ready.

Here's a sample search. The same index we just populated has a function called similarity_search (or similarity_search_with_score), and you can pass how many semantically matching documents you want: k=1 returns only one matching document. With the query 'How is India's economy?' we get one document; with k=2 we'd get the two documents matching India's economy. If you print page_content you can see we're getting both pages. One is fine, but you can use two if you want, for testing purposes.
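Here's that indexing step sketched against the same 2023-era APIs (the model name all-MiniLM-L6-v2 is my assumption, chosen because it's a common Sentence Transformer with the 384-dimensional output the video mentions):

from langchain.embeddings import SentenceTransformerEmbeddings
from langchain.vectorstores import Pinecone
import pinecone

# 384-dimensional embeddings, matching the index dimension.
embeddings = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")
print(len(embeddings.embed_query("sample text")))   # -> 384

pinecone.init(api_key="YOUR_API_KEY", environment="YOUR_ENVIRONMENT")

# Embed every chunk and upsert it into the 'chatbot' index.
index = Pinecone.from_documents(docs, embeddings, index_name="chatbot")

# Sample semantic search: the top-2 chunks for a query.
matches = index.similarity_search("How is India's economy?", k=2)
print(matches[0].page_content)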
So this is how it works: you index your documents, and then you can find the chunks related to your query; that's what our chatbot is going to do. Now, enough, let's go to the Streamlit code, because that's where we use this particular index. Our Streamlit app is already running, and this is our main file. Let's see what exactly we're importing. As I told you, we're going to create a LangChain chatbot, so rather than using the plain openai package and that logic, we use LangChain's chat model, ChatOpenAI, because it supports the other pieces too: LangChain's concept of chains (we'll use the ConversationChain), its conversation memory (we'll use the ConversationBufferWindowMemory; we'll see what exactly that is), and then we need certain prompts: a system prompt, a human prompt, and a chat prompt template.

What are these system, human, and chat prompts? If you remember how the chat model, OpenAI GPT-3.5, works: it has a role called system, which is the overall instruction that sets the behavior of your bot. It could be something like 'You are a helpful assistant', or 'Answer based on the given context'; all that information goes into the system message. Then the user query, along with the actual context, goes into the human message, like 'How is India's economy?'. And the assistant messages are whatever comes back from the bot, from GPT. Those are the prompt templates we need: system message, human message, chat, and all of that.

For the UI we use streamlit_chat. First, st.title is what you see at the top of the UI, 'Langchain Chatbot' or whatever you like. Then we need two lists in our session state: all the responses we got from the bot, and all the requests, the queries the user sent. Why do we need them? Because with streamlit_chat we can take these responses one by one and display them nicely: this display is coming from the streamlit_chat library, and it renders the list of bot responses and the list of user queries alternately. If you want to know more about streamlit_chat, you can go to its page and see how the UI looks.

For the chat model we use gpt-3.5-turbo; you can put gpt-4 also. We need an OpenAI key; mine is displayed here, and I'm going to delete it after this particular video. Then we store one more variable in the Streamlit session, the buffer memory. As I told you, the bot needs to have some kind of memory. This isn't plain question answering, where I ask about something and it answers; it can't answer follow-up questions like 'and its culture', because that alone is not sufficient. So we need some memory for the chat.
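The top of the Streamlit app looks roughly like this (a sketch of the imports and session-state setup the video describes; the greeting that seeds the responses list, and the exact keys, are my assumptions):

import streamlit as st
from streamlit_chat import message
from langchain.chat_models import ChatOpenAI
from langchain.chains import ConversationChain
from langchain.chains.conversation.memory import ConversationBufferWindowMemory
from langchain.prompts import (
    SystemMessagePromptTemplate,
    HumanMessagePromptTemplate,
    ChatPromptTemplate,
    MessagesPlaceholder,
)

st.title("Langchain Chatbot")

# Bot responses and user requests, kept across Streamlit reruns.
if 'responses' not in st.session_state:
    st.session_state['responses'] = ["How can I assist you?"]  # seed greeting (assumption)
if 'requests' not in st.session_state:
    st.session_state['requests'] = []

llm = ChatOpenAI(model_name="gpt-3.5-turbo", openai_api_key="YOUR_OPENAI_KEY")

# Conversation memory: keep only the last 3 exchanges (explained next).
if 'buffer_memory' not in st.session_state:
    st.session_state.buffer_memory = ConversationBufferWindowMemory(k=3, return_messages=True)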
We know that ChatGPT has a memory: that whole messages list is nothing but the history of the conversation. LangChain has data structures to maintain this conversation, and that's what we store in the session variable called buffer_memory, which holds a ConversationBufferWindowMemory.

What exactly is the ConversationBufferWindowMemory? We've actually written an in-depth post about this on the FutureSmart AI blog, covering all the memory types LangChain has and how they work. First, there's a plain buffer memory: you keep adding the whole conversation into a buffer, but since you send this conversation history to ChatGPT, at some point it will exceed the token limit. So you have different options. You can use a ConversationSummaryMemory: rather than keeping all the back-and-forth between the user and the bot, it summarizes the conversation and keeps the summary, but the moment you summarize something you lose some information. Then there's the ConversationBufferWindowMemory: rather than maintaining the whole conversation, you maintain only the recent three or four exchanges, the window. That's what we're going to use, with a window of three messages, because our follow-up questions will only refer back to the last one, two, or three messages. You have other choices there too; go and explore the other conversation memories.

Now, we need a system prompt, some behavioral instruction, and it simply says: answer the question as truthfully as possible using the provided context. Then we have a prompt template for the human message. If you're not familiar with prompt templates: a template is just something you can substitute values into. The HumanMessagePromptTemplate is for our human message (we have a bot and a human), and our human message template is simply whatever the input is; we're saying it takes a variable called input. Then we have a ChatPromptTemplate, which combines all our other templates: it contains our system template, the instruction we have, then a MessagesPlaceholder (I'll tell you what exactly that means), and then the human message. What exactly is the MessagesPlaceholder? Along with the system message and the human message, we're also inserting something into the chat: the conversation history. The MessagesPlaceholder is required so the template knows which variable is going to carry that information, and that variable is history.
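In code, the prompt setup described above looks roughly like this (the system-prompt wording is paraphrased from the video, not copied verbatim):

system_msg_template = SystemMessagePromptTemplate.from_template(
    "Answer the question as truthfully as possible using the provided context."
)

# The user's current message is substituted into the {input} variable.
human_msg_template = HumanMessagePromptTemplate.from_template("{input}")

# 'history' is filled in at runtime by the conversation memory.
prompt_template = ChatPromptTemplate.from_messages([
    system_msg_template,
    MessagesPlaceholder(variable_name="history"),
    human_msg_template,
])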
It might be confusing, but there's a GitHub thread where you can see the error you get if you don't pass that particular placeholder: something like the chain expecting one input variable but getting history as well, because history is injected by our conversation memory. You can see there that the ConversationChain passes the history variable from the memory, so the prompt requires that placeholder. You can read more about the different templates on the LangChain website, what exactly chat prompt templates and the MessagesPlaceholder are, but for simplicity think of it this way: we're creating this chat prompt template to shape the behavior of the bot, with something for the system, something for the user's current message, and something maintaining the history, which comes from our conversation memory.

And then, eventually, the ConversationChain. What is a chain in LangChain? A chain is nothing but putting multiple components together. Here LangChain allows us to plug in our memory variable (the memory that's going to supply that history variable), the language model that's going to answer our question, and the prompt template. A chain is just the combination of these components: memory, template, and language model. (Besides the ConversationChain there are also simple chains, like the question-answering chain we used in an earlier video.) Now we have the chain, which wires up all the pieces, and we can interact with it.

Next, some basic Streamlit logic. We use one container to hold the query the user types, and another container for the responses, where we render the messages. Let's go back to the UI. We have a placeholder, a text area where the user can provide input, which we store in a variable called query. Then we build some kind of conversation string; for now just ignore it, I'll tell you later what it's for: we need it for the query refiner, and I'll explain in detail what the query refiner does (I mentioned it earlier, it refines the query). For simplicity, think of it as refined_query being equal to the same query. We pass it to find_match, which goes against our Pinecone index: we embed that query so we get a vector, then find the two matching documents, and it returns them as the context. And eventually our conversation chain predicts: the ConversationChain takes a variable called input, and we pass it our context plus our query, where the context is the two documents coming from our knowledge base, the vector store, and the query is what the user asked.
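Putting those pieces together, the chain and the request-handling flow look roughly like this (a sketch; find_match is the retrieval helper from earlier, query_refiner and get_conversation_string are sketched further below, and the widget labels are my assumptions):

conversation = ConversationChain(
    memory=st.session_state.buffer_memory,
    prompt=prompt_template,
    llm=llm,
    verbose=True,
)

response_container = st.container()   # rendered chat messages
textcontainer = st.container()        # the input box

with textcontainer:
    query = st.text_input("Query: ", key="input")
    if query:
        with st.spinner("typing..."):
            conversation_string = get_conversation_string()
            refined_query = query_refiner(conversation_string, query)
            st.subheader("Refined Query:")
            st.write(refined_query)
            context = find_match(refined_query)
            response = conversation.predict(input=f"Context:\n{context}\n\nQuery:\n{query}")
        st.session_state.requests.append(query)
        st.session_state.responses.append(response)

with response_container:
    for i in range(len(st.session_state['responses'])):
        message(st.session_state['responses'][i], key=str(i))
        if i < len(st.session_state['requests']):
            message(st.session_state['requests'][i], is_user=True, key=str(i) + '_user')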
The chain then gives us the response, which internally is produced by ChatGPT. Once we have the query and we've got the response, we store them in the appropriate session variables, and that's what we display: we iterate through them and render each with the same streamlit_chat message function. The only thing is that for the requests we pass is_user=True, and for the responses we don't; that's how it decides where to align each message: with is_user=True it aligns on one side, and when it's not set, it aligns on the other.

Now let's see what exactly this conversation string is. Maybe I can print the conversation string; let's refresh and dig in to see what it's doing. When we ask 'How is India's economy?', you'll see that even that single query got refined, which is not really necessary when there's only a single message. It says India's economy is 3 trillion or something; this must be coming from our document, so if I go to the India document and search for something related to the economy: there it is. It might not be factually correct, but it's actually coming from the document, which is what we wanted. And the conversation-string placeholder shows nothing at this point.

Now let's ask 'and its culture'. This is where we show the conversation, the same conversation, in string format, because I need that string format to create my refined query. You can see: we had 'How is India's economy?', then the bot answered something, then I asked 'and its culture', and taking all of that information, my query got refined into something about India's economy and its culture. What we're doing is using one prompt that takes this recent conversation plus the current query and transforms my current query into a fully refined query that can be used for semantic search.

A typical example: I have this prompt in the GPT-3 playground that says, in effect, I'm going to give you the conversation log and the recent query, and you need to convert it into a more refined query. In my conversation log, the user says hi, the bot says hi, then the user asks, 'What documents are required for private limited registration?', and the bot answers something; we don't care what it answered. Then the user asks, 'How much does it cost?'. Which cost is this query talking about? It's not sufficient. Say I have a document that talks about private limited registration costs; if I use only 'how much does it cost', it won't be able to find the right context, because that could just as well mean how much a car costs or a mobile costs, not the private limited registration cost. So the recent query needs to be refined using the previous information, and that's what this refine prompt does: it converted it into the more refined query 'What are the costs associated with registering a private limited company?'.
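The refinement prompt has roughly this shape (the wording is my paraphrase of the playground demo, not copied verbatim from the video):

# Prompt skeleton for query refinement; {conversation} and {query}
# are filled in before calling the completion model.
refine_prompt = """Given the following conversation log and user query, formulate a more refined question that would be the most relevant for retrieving an answer from a knowledge base.

CONVERSATION LOG:
{conversation}

Query: {query}

Refined Query:"""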
Now that refined query is sufficient: I can take it and match it against any of my documents, and if I have anything related to private limited registration, it will be found. The same thing is happening here with our app. Let's pick up this conversation, our conversation log with the human and the bot messages, and my recent query; I take that recent query, put it in the playground, and ask it to generate the refined query. It says, 'What is the role of culture in India's economy?'. That refined query looks better than the raw one. Is this GPT-3.5? No, the refiner is also GPT-3.

So what's our logic to get that? It's the same prompt, implemented in code. How do we get the conversation string? We know we're storing all our responses and user requests in those session variables, so I just iterate through them and build one string variable holding the conversation as plain text. Then in query_refiner I take that same conversation string and the current user query, and I use that same prompt, asking: based on this conversation and this current query, generate the refined query. That's the refined query you see on the UI; this is how it actually works. And this is very important: when you use the ChatGPT UI you might not bother about this, but when you use Pinecone, or your own vector-based semantic search, you want to make sure the query is converted into something more meaningful that can then be matched against the documents. Here's our conversation again: it refined the query into 'What is India's relationship with the United States?', and you can see we pass this refined query to find the context; that refined query is what actually goes to Pinecone.
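Those two helpers, sketched (the video says the refiner also uses GPT-3; text-davinci-003 and the openai 0.x Completion API are my assumptions for that, and refine_prompt is the skeleton from the previous sketch):

import openai
# assumes openai.api_key has been set

def get_conversation_string():
    # Flatten the stored requests/responses into one plain-text log.
    # responses[0] is the seed greeting, so responses[i + 1] pairs with requests[i].
    conversation_string = ""
    for i in range(len(st.session_state['responses']) - 1):
        conversation_string += "Human: " + st.session_state['requests'][i] + "\n"
        conversation_string += "Bot: " + st.session_state['responses'][i + 1] + "\n"
    return conversation_string

def query_refiner(conversation, query):
    # Ask a GPT-3 completion model to rewrite the query so it stands alone.
    response = openai.Completion.create(
        model="text-davinci-003",
        prompt=refine_prompt.format(conversation=conversation, query=query),
        temperature=0.7,
        max_tokens=256,
    )
    return response['choices'][0]['text'].strip()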
So that's how this code works. I know I couldn't explain everything minutely; it would take a lot of time to explain every LangChain concept and every Pinecone concept. The good thing is that you're going to have this code: I upload the code for every video, and you'll have a link in the description. There are also already many blog posts we've written: one post is totally dedicated to LangChain memory and covers all the memory types, and we have one on the LangChain summarizer, dealing with everything in depth. You can always go and read the full in-depth posts on the FutureSmart AI blog; you can even see the earlier posts in the LangChain series, like a LangChain Q&A bot using OpenAI embeddings, all sorts of different things we've covered. And as I said, I've covered those pieces in individual videos; the only thing is that here I've combined two or three videos so that we have a bot with a conversational UI coming from streamlit_chat, conversational capability coming from ChatGPT, answers from our own documents so that we have control over the answers, and follow-up questions handled properly. To get a follow-up question answered, to find the relevant context with respect to it, we need to refine the query, and we used a separate prompt for that. I think I had a similar idea myself, but I also got this idea somewhere, maybe in the Pinecone documentation or one of their tutorials; in any case I experimented a lot with this kind of thing. I hope you found this video useful. Go check the code, you have the relevant links, go to those links and read about them. And as always, you can comment below my YouTube video and I'll try to answer those doubts. Okay, thank you very much.
Info
Channel: Pradip Nichite
Views: 32,300
Keywords: langchain, chatbot building with chatgpt, chatbot, streamlit python, langchain in python, langchain ai, streamlit, streamlit tutorial, streamlit dashboard, langchain tutorial, chatbot tutorial, chatbot gpt, streamlit beginner tutorial, chatbot using python, streamlit session state, langchain demo, streamlit machine learning app, chatbot python, langchain openai, train openai with own data, using chatgpt to build a chatbot, langchain explained
Id: nAKhxQ3hcMA
Length: 30min 22sec (1822 seconds)
Published: Sun May 21 2023