LLAMA2 70b-chat Multiple Documents Chatbot with Langchain & Streamlit | All OPEN SOURCE | Replicate API

Video Statistics and Information

Captions
Hi everyone, in this video we're going to build a multi-document chatbot: an app that lets you upload any type of document, whether a text file, a PDF, or a Word document, and start chatting with it, asking questions about your documents. We're going to do that using Replicate, a platform that lets us run models in the cloud so you don't have to download the model, and we'll also look at the downloaded, quantized version of Llama 2. I'll show you step by step how to build this. It's going to be quite a long video, but we'll work through it step by step, and I hope you get something useful from it. The app also has memory, meaning it stores the chat history. So let's see how it works. I'm going to browse and load some documents: a PDF about statistics, a Walmart document, a text file, and I've also uploaded some code. Let me start asking questions. I'll type "what is linear regression", hit Send, and wait for the response. There it is, and it's quite accurate: "A linear regression model is a statistical model that is used to predict the value of a continuous outcome variable based on one or more predictor variables." Awesome. Now let me ask about my code: "tell me about the Streamlit code". I hit Send, and it says the Streamlit code provided is a chatbot that uses the App class from the embedchain library to interact with the user and retrieve information.
I really like this aspect of it: it is indeed a Streamlit app that uses embedchain, so the model is able to explain your code in detail. It looks pretty accurate, and we're going to build this chatbot step by step. First, let me walk you through the architecture of the app. We upload PDFs, text files, or Word documents, whatever documents you have, and extract their content. Using the character text splitter, we split the extracted data into chunks: chunk one, chunk two, and so on, depending on the chunk size. For each chunk we create embeddings; for that we'll use Sentence Transformers from Hugging Face. The embeddings are stored in a knowledge base with a semantic index. When the user asks a question, the question is also turned into an embedding, and a semantic search runs against the knowledge base to find similar chunks. The results are ranked and passed to the LLM, which produces the answer and returns the response to the user. We're using Llama 2 as the LLM here, both the quantized version and the hosted model with 70 billion parameters.
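The splitting step described above can be sketched in plain Python. This is a minimal illustration of fixed-size chunking with overlap, not the app's actual code, which uses LangChain's CharacterTextSplitter; the 1000/100 sizes match the values used later in the video.

```python
def split_into_chunks(text: str, chunk_size: int = 1000, overlap: int = 100) -> list[str]:
    """Split text into fixed-size chunks, keeping `overlap` characters of shared context."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # step forward, re-covering the overlap region
    return chunks

doc = "x" * 2500
chunks = split_into_chunks(doc)  # 3 chunks of lengths 1000, 1000, 700
```

The overlap is what lets a sentence straddling a chunk boundary still appear whole in at least one chunk, which helps the semantic search step.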
Okay, so let's get into some code. I'll walk you through the setup, and I'll put the link in the description below so you can grab it. In here I have my environment set up; let me open a new terminal. If you want to set up your own environment, create a virtual environment from your terminal so that all the project's dependencies live in one place, which is very useful if you have to share the project with your colleagues. Next we need an API token, which means we need access to Replicate, so I'll show you how that's done. A quick explanation of Replicate: it's a platform that makes it easy to run machine learning models in the cloud from your own code, so you can run an open-source model like Llama 2, or run your own public or private models. Let's see how it works. When you get to the site, you need to set up an account: click Sign in, sign in with GitHub if you have an account, or create a new one. I have a GitHub account connected, so I sign in with GitHub. Then you go to your account settings and get your API token. You can give the token a name, say "data projects", click Create, copy it, and paste it into your project's .env file. You can use Replicate for free, but after a limit you'll be asked to enter your credit card.
Depending on your use case and your hardware, you can see what the charges are: there's pricing for public models and for running models privately, and billing varies with the GPU or CPU and RAM you use, for example an eight-gigabyte configuration, but you can use it for free when you start. Okay, let's get into it. These are the requirements that will carry us through the project, the packages we're going to install. In your terminal you can type "pip install -r requirements.txt", and when you hit Enter it installs every package listed. I already have them installed, so I don't need to do that again. Now let's get into some code. We import streamlit to create the app, message from streamlit_chat to render the conversation, ConversationalRetrievalChain to get answers to our questions, and HuggingFaceEmbeddings to create the embeddings; we'll use Sentence Transformers, which are really good at clustering and semantic search. Replicate is what we'll use to create the LLM, and CTransformers is there if you want to use the CPU and run the downloaded model locally. CharacterTextSplitter splits the text, and FAISS acts as our vector store, our knowledge base. FAISS is Facebook AI Similarity Search, a library for efficient similarity search and clustering of dense vectors, so it's pretty good at this. There are other options, like Chroma DB, but we're using FAISS for this project.
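Based on the imports the video walks through, a plausible requirements.txt would look like the following. This is my reconstruction; the exact package list and any version pins come from the repository linked in the description, not from the transcript:

```text
streamlit
streamlit-chat
langchain
sentence-transformers
faiss-cpu
replicate
ctransformers
pypdf
docx2txt
python-dotenv
```

Installing is then the single `pip install -r requirements.txt` command mentioned in the video.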
We also use ConversationBufferMemory for memory storage, so we can store the past conversation; PyPDFLoader for PDF documents, TextLoader for text documents, and Docx2txtLoader for Word documents; and a streaming callback to stream the conversation output. We import os as well, because we'll be creating file paths, and load_dotenv from dotenv to load our API token: as soon as we call load_dotenv(), it looks for the API key that Replicate needs. Finally, we import tempfile, because we'll be storing the uploaded files in temporary files. So let's start. Let's create a function called main. Inside it, create a Streamlit title, something like "Multi-Docs ChatBot using Llama 2", and a sidebar title, say "Document Processing". Now let's handle uploads: uploaded_files = st.sidebar.file_uploader(...), and since we want to accept multiple files, we pass accept_multiple_files=True. Then we say: if uploaded_files, we create a text list, and we'll store the content of all the uploaded files in that list.
Next we loop over the files. For each file we split the name from the extension with os.path.splitext. Then we create a temporary file with delete=False, because we don't want the temp file to be deleted yet, write the uploaded file's bytes into it, and keep its path. If the file extension is .pdf, we set the loader to PyPDFLoader and load the file from the temp file path. Elif the extension is .docx or .doc, we load it with Docx2txtLoader from the temp file path. Elif the extension is .txt, we load it with TextLoader, again from the temp file path. If a loader was created, meaning the file was loaded, we extend the text list with the loaded documents, and then remove the temp file path, so after loading everything the temporary file is cleaned up. Now we want to split the text. We create a CharacterTextSplitter with a newline separator, a chunk_size of 1000, and a chunk_overlap of 100.
The length_function is set to len. So we split the text using the character splitter: as of now we have the files loaded, and we're splitting the documents into chunks, so text_chunks = text_splitter.split_documents(text). Next, the embeddings: embeddings = HuggingFaceEmbeddings with the model name of a Sentence Transformers model, and model_kwargs set to use the CPU. The next step is to create the vector store with FAISS from the text chunks, creating an embedding for each chunk. Then we need the chain object, so let's create a function called create_conversational_chain that takes the vector store. Inside it we create the LLM. We could use CTransformers for a local CPU model, but we're doing Replicate first, and we enable streaming so the output is displayed as it's generated. For the model, go to the Replicate site and search for llama-2; as you can see, there are a lot of models there. We're using the 70-billion-parameter one: click on it, open the API tab, copy the model string, and paste it into the code. Next we set the callback for streaming and the model inputs.
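Putting the embedding and vector-store steps together, the function below is a sketch of that part of the app. The imports are deferred into the function body so the snippet can be read and imported without the heavy dependencies installed, and the model name is the commonly used Sentence Transformers default, which is my assumption since the transcript doesn't spell it out.

```python
def build_vector_store(text_chunks):
    """Embed document chunks with Sentence Transformers and index them in FAISS."""
    from langchain.embeddings import HuggingFaceEmbeddings
    from langchain.vectorstores import FAISS

    embeddings = HuggingFaceEmbeddings(
        model_name="sentence-transformers/all-MiniLM-L6-v2",  # assumed model name
        model_kwargs={"device": "cpu"},                       # run on CPU, as in the video
    )
    return FAISS.from_documents(text_chunks, embedding=embeddings)
```

FAISS keeps the whole index in memory here, which matches the video's use; a persistent store like Chroma is the alternative the narration mentions.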
The temperature controls the randomness of the output; we set it to 0.01, with a max_length of 500 and top_p of 1, and you can vary those numbers. Then the memory: memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True). Now the chain: chain = ConversationalRetrievalChain.from_llm, with llm set to our model, chain_type="stuff", and retriever=vector_store.as_retriever(search_kwargs={"k": 2}). So we're creating the chain by passing in the LLM and the vector store retriever: we get our results from the vector store and take the top two. Then we return the chain. That's the function that creates the chain object, and we call it from main. What we want to do now is initialize our conversation, so let's create an initialize_session_state function. The conversation state has three parts: history, past, and generated. If "history" is not in the session state, we set it to an empty list; if "generated" is not in the session state, we seed it with something like "Hello! Ask me anything"; and if "past" is not in the session state, we seed it with something like "Hey!". Then we plug initialize_session_state into main. Next, let's create the conversation_chat function, which returns the answer for the user.
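The chain-construction step can be sketched as below. Again the imports are deferred, the Replicate model string is a placeholder (copy the real one, including its version hash, from the model's API tab), and the parameter names follow LangChain's 2023-era API, so treat this as a sketch rather than the definitive implementation.

```python
def create_conversational_chain(vector_store):
    """Wire a Replicate-hosted Llama 2 model to the FAISS retriever with memory."""
    from langchain.llms import Replicate
    from langchain.chains import ConversationalRetrievalChain
    from langchain.memory import ConversationBufferMemory

    llm = Replicate(
        # Placeholder: paste the full model string from Replicate's API tab here.
        model="replicate/llama-2-70b-chat:<version-hash>",
        input={"temperature": 0.01, "max_length": 500, "top_p": 1},
    )
    memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
    return ConversationalRetrievalChain.from_llm(
        llm=llm,
        chain_type="stuff",                                           # stuff retrieved chunks into the prompt
        retriever=vector_store.as_retriever(search_kwargs={"k": 2}),  # top-2 similar chunks
        memory=memory,
    )
```

Swapping in the locally downloaded quantized model later means replacing the Replicate LLM with a CTransformers one; the rest of the chain stays the same.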
We pass in the query, the chain, and the history. The result is chain({"question": query, "chat_history": history}); we append the query and the answer to the history, and since what we're interested in is the answer, we return result["answer"]. Now let's define a function to display the chat history, display_chat_history, which takes the chain. We instantiate two containers, one for the replies and one for the input form. (Guys, it's quite a long video, so I'll keep walking you through the code and explaining it as we go.) Within the form container we create a form that clears on submit, and user_input = st.text_input with the label "Question:", a placeholder like "Ask about your documents", and a key of "input". Then the Send button, so the user can submit the question by clicking Send: submit_button = st.form_submit_button(label="Send"). When the user clicks Send, that is, if submit_button and user_input, we show a spinner saying something like "Generating response..." while the answer is produced, and output = conversation_chat(...), where we pass the user input, the chain, and the session-state history. Then we append the user input to the past conversation.
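The conversation_chat logic is small enough to exercise with a stub standing in for the real chain; the stub below is purely illustrative.

```python
def conversation_chat(query, chain, history):
    """Ask the chain a question, record the exchange, and return the answer."""
    result = chain({"question": query, "chat_history": history})
    history.append((query, result["answer"]))
    return result["answer"]

# Stub in place of the real ConversationalRetrievalChain:
def fake_chain(inputs):
    return {"answer": f"You asked: {inputs['question']}"}

history = []
answer = conversation_chat("What is linear regression?", fake_chain, history)
# answer == "You asked: What is linear regression?"; history now holds the pair
```

Passing the history both into the chain call and mutating it afterwards is what gives follow-up questions their context.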
We also append the output to generated; this is nothing but how we want the exchange to be displayed as a conversation. Then, if there are generated responses, that is, answers from the bot, from the LLM, we work inside the reply container and iterate: for i in range(len(st.session_state["generated"])), we render a message for the past entry, st.session_state["past"][i], with is_user=True, a key like str(i) + "_user", and an avatar style, which can be anything you want, say "thumbs"; and a message for the generated entry, the response from the bot, st.session_state["generated"][i], with a key of str(i) and a fun emoji avatar style. Now in main we call display_chat_history with the chain, and it looks like everything is done. Let me check; I think I'm missing something here: I should add the memory to the chain. That looks right now, so let me save it. So here's how the code works. We import these packages: streamlit to create the app, message from streamlit_chat to render the chat, ConversationalRetrievalChain to retrieve the answers, the Hugging Face embeddings to create embeddings, CTransformers to load a local model, Replicate to host the model in the cloud, CharacterTextSplitter to split the text, FAISS for the vector store, ConversationBufferMemory to store the history, PyPDFLoader to grab the PDF files, TextLoader for text files, Docx2txtLoader for Word files, and the streaming callback to stream the conversation output.
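The render loop pairs each past user message with its generated reply. With a plain dict standing in for st.session_state (the real code calls streamlit_chat's message() instead of collecting tuples), the ordering works like this:

```python
session_state = {
    "past": ["Hey!", "What is linear regression?"],
    "generated": ["Hello! Ask me anything", "A statistical model that predicts an outcome."],
}

transcript_lines = []
for i in range(len(session_state["generated"])):
    # One user bubble (is_user=True in streamlit_chat), then one bot bubble.
    transcript_lines.append(("user", session_state["past"][i]))
    transcript_lines.append(("bot", session_state["generated"][i]))
```

Seeding both lists with a greeting, as the video does, is why the chat window opens with an exchange already visible.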
We import os to create paths, dotenv to load the API token, and tempfile to create temp files. We initialize the session state with history, generated, and past for the conversation: if history is not in the session state, we create an empty list; if generated is not in the session state, we seed it with "Hello! Ask me anything"; and if past is not in the session state, we seed it with "Hey!". We create conversation_chat, passing the query, the chain, and the history; we get the result by passing the chain the question and the chat history, append the query and the result to the history, and return the answer. Then we display the chat history: we pass in the chain and create the containers, the reply container and the form container, for the whole application. Within the form container we build the form, and when the form is submitted it clears the question input. For the user text input we set the label "Question:" and a placeholder like "Ask me about your documents", and the user submits with the Send button. If the user clicks Send, we show a spinner saying "Generating response...", and the output comes from conversation_chat, into which we pass the user input, the chain, and the session history. We append the user input to past and the output to generated. Then, if there are generated responses, we create a container for them and iterate through the generated messages.
For each turn we render two messages, one for the past entry and one for the generated entry. For the user we use the thumbs avatar and for the bot an emoji, but it can be any style you want. Then there's the function that creates the chain, which takes the vector store: we use Replicate, give it the model, enable streaming, and set the inputs, a temperature of 0.01, a max length of 500 that you can vary, and top_p of 1, and we pass in the conversation memory. We create the chain by passing in the LLM and the vector store, our knowledge base, as the retriever, and we return the top two answers. Then, in main: we call initialize_session_state, create the title for the chat app and the sidebar title, and upload the documents. If files are uploaded, we create a list, loop through the uploaded files, and split the name from the extension. For each file we create a temp file with delete=False so it isn't deleted, read the uploaded file, and write it into the temp file, keeping the temp file path. The loader starts as None, since nothing has been loaded yet. If the file extension is .pdf, we use PyPDFLoader; if the extension is .docx or .doc, we load it with Docx2txtLoader; if it's a .txt file, we load it from the temp file path with TextLoader. If we have a loader, meaning the files have been loaded, we extend the text list and remove the temp file path, and then we split with the CharacterTextSplitter, with a chunk size of 1000, an overlap of 100, and the length function set to len.
We create the text chunks by splitting the text, create the embeddings using Sentence Transformers on the CPU, and build a FAISS vector store from the chunks. Then we create the chain from the function we wrote and pass it to display_chat_history. All right, let's run it and see how it goes: streamlit run app.py. We're getting an error; let's see, line 95... okay, let me fix that and save. Let's load some files. It's working now. Let me ask: "Who is the CEO of Walmart?" Okay. Let me ask another question: "What is the embedchain code about?" "Tell me about it." Now what I want to do is change the model to the 70b-chat version, so let's explore llama-2 on Replicate again. Yes, I want to use this one, since I don't love the responses we're getting. I'll copy the model string and paste it in to change the model. So now I'm using the 70-billion-parameter chat model; sometimes you need to try different models and see which one works best. Let me refresh it. Now I'll ask "What is a linear regression model?" Perfect. "Tell me about the Streamlit code." We're getting more guarded responses now, along the lines of "I cannot provide information that is not related to the context and may promote harm." "Who is the CEO of Walmart?" Right, I think I like this model's responses more. In the same way, if you want to try the downloaded, quantized version of the model, you can just change the LLM; I'll show you how to do that.
You can load your local LLM right here with CTransformers and simply swap it in. So give it a try and let me know what you think. I hope you liked this video; please subscribe, share, like, and see you in the next one.
Info
Channel: DataInsightEdge
Views: 14,093
Keywords: advanced chatbot, open-source technologies, Sentence Transformers, embeddings, Faiss CPU, vector storage, Llama 2, Replicate platform, Streamlit library, innovation, AI, Streamlit, generative AI, Replicate llama2 model, Replicate API, Replicate links., advanced, chatbot, open-source, technologies, Faiss, CPU, vector, storage, Llama, Replicate, platform, library, interface, journey, documents, tutorial, DataInsightEdge, cloud
Id: vhghB81vViM
Length: 70min 20sec (4220 seconds)
Published: Thu Aug 24 2023