ChatCSV App: Chat with CSV files using LangChain and Llama 2

Captions
Hello everyone. In this video tutorial we will create a chatbot to chat with our CSV files using Llama 2 and LangChain. Here is the complete architecture; let's look at it first and then move toward the code part.

In step number one, the user uploads a CSV file; the user can upload a single CSV file or multiple CSV files. In step number two, we extract the data, that is, the content, from that CSV file. We cannot pass that data directly to the Llama 2 model, because the Llama 2 model has an input token limit of 4,096 tokens, and one token is roughly equal to four English characters, so the limit is approximately 16,000 English characters. A very large CSV file can easily contain more than 16,000 English characters, so we cannot pass all of its data directly into the Llama 2 model. What we do instead is split the data into a number of small chunks, say 10 or 20 of them, and define a maximum size per chunk; here we set the chunk limit to a maximum of 500 English characters (a quick sanity check of this arithmetic appears right after this overview).

After splitting the data into multiple chunks, we create embeddings for each text chunk. Embeddings are vectors, for example (0.1, 0.2, 0.3, ...), that represent a chunk of text in a compact numeric form, so if we have 20 text chunks, we create an embedding for each of the 20 chunks. We then build a semantic index and use it to store our embeddings in a knowledge base, that is, a vector database. There are many vector stores; in this project we will use FAISS. FAISS stores the embeddings locally, while a service like Pinecone stores them in the cloud, and each vector store has its own advantages, disadvantages, and limitations. In this tutorial we will look at how to store embeddings in a FAISS vector database, and our embeddings will be saved in our knowledge base in that format.

Now, when the user asks a question, we convert the question into embeddings as well, then do a semantic search: we compare the question's embedding against the knowledge base to find all the chunks relevant to the question the user has asked. Then we rank these results, for example keeping the top three or top five; usually we keep the top three results found in our vector database.
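As a back-of-the-envelope check of the character budget mentioned above (a sketch; the 4-characters-per-token figure is the video's rule of thumb, not an exact tokenizer property):

```python
# Llama 2's context window is 4,096 tokens; the video uses the rough
# rule of thumb that 1 token is about 4 English characters.
context_tokens = 4096
chars_per_token = 4
print(context_tokens * chars_per_token)  # 16384 -> "approximately 16,000 characters"

# With 500-character chunks and only the top 3 chunks retrieved per query,
# the context we send to the model stays far below that limit.
chunk_size = 500
top_k = 3
print(chunk_size * top_k)  # 1500 characters of retrieved context
```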
After ranking all the results, we send the top-ranked chunks to the Llama 2 model, and we send the user's question directly to the Llama 2 model as well; the Llama 2 model then generates a natural-language response. So this is how the architecture looks; let's now move toward the code part.

You can see that I have created a new project here in my PyCharm Community Edition. First I create a requirements.txt file, in which I list all the packages required for this project: langchain, ctransformers, sentence-transformers, and faiss-cpu (the file is sketched below). We require ctransformers because we will be running the Llama 2 model in GGML format, and the CTransformers library provides Python bindings for models implemented in the GGML format. We are using Sentence Transformers embeddings from Hugging Face, so we require the sentence-transformers package as well, along with faiss-cpu, since we are using FAISS as our vector database.

Let's install all these packages. One way is to go to the terminal and run pip install -r requirements.txt. The other way is to go to Settings, then Project, then Python Interpreter, click the plus button, type faiss-cpu, and click Install Package; once you see the message that the package installed successfully, install langchain, ctransformers, and sentence-transformers the same way, one by one. I have shown you both approaches in which you can install these packages.
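For reference, a plausible requirements.txt for this setup would look like the following (unpinned, matching the packages named above; the video does not pin versions, and note that newer langchain releases have since reorganized these imports):

```text
langchain
ctransformers
sentence-transformers
faiss-cpu
```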
Now that the packages are installed, I create a script.py file, and we are good to go. First of all I need to create a directory named models, in which I will place my Llama 2 model. I am running this script on my local CPU, which has 8 GB of RAM; it is not a very powerful machine, but it will work fine. To download the Llama 2 model, go to Hugging Face and find TheBloke's Llama 2 model with 7 billion parameters in GGML format. To run the Llama 2 model on a CPU we need an optimized model, and GGML models are optimized so that they can run easily on a CPU; the standard Llama 2 model requires a huge amount of memory as well as a fast GPU, so if you do not have a fast GPU and want to run on your local CPU, use a Llama 2 model quantized in the GGML format.

If you click on Files and versions on the model page, you can see the different model files available. In this tutorial I will be using the Llama 2 model with 7 billion parameters in .bin format. As I have 8 GB of RAM available in my system, I will use the 4-bit quantized model, which consumes a maximum of 6.29 GB of RAM. If you have 16 GB of RAM in your system, you can use the 8-bit quantized model instead, which consumes a maximum of 9.66 GB of RAM. To download a model locally, go to Files and versions, find the model file, and click download: the 4-bit .bin file if you have 8 GB of RAM, or the 8-bit .bin file if you have 16 GB.

The CSV data I will be using in this tutorial is the World Happiness Report data from Kaggle, specifically the 2019.csv file. You can create an account on Kaggle, click on download, and get this dataset easily from kaggle.com.

Let's move toward the implementation. I have already placed the 4-bit quantized Llama 2 GGML .bin file into my models folder. Next I create a new directory named data and put the dataset there. Let me show you the data which I have downloaded from Kaggle: it has a country-or-region column with 156 countries, and for each of those 156 countries it has the GDP per capita, social support, healthy life expectancy, freedom to make life choices, generosity, and perceptions of corruption values. We will be using this data and chatting with our Llama 2 model about it.
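At this point the project layout looks roughly like this (a sketch; the exact quantized filename follows TheBloke's usual GGML naming scheme and is an assumption):

```text
chatcsv/
├── requirements.txt
├── script.py
├── models/
│   └── llama-2-7b-chat.ggmlv3.q4_0.bin   # 4-bit quantized GGML model (~6.29 GB RAM)
└── data/
    └── 2019.csv                          # World Happiness Report 2019 (Kaggle)
```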
Now let's start writing the code. First of all I import all the required libraries and explain each of them. The first is CSVLoader; we require this so that we can load our CSV file. Then we have RecursiveCharacterTextSplitter; we use this library to split our text into chunks, because, as I told you at the start, after extracting the data from the CSV file we need to split it into small text chunks. Then we download the Hugging Face embeddings, which will create the embeddings for each of the text chunks. You can remove the Pinecone import, since we are not using Pinecone as our vector database; here we are using FAISS, where we will store all our embeddings. As we have the Llama 2 model in GGML format, we require the CTransformers library, which provides Python bindings for models implemented in the GGML format. Then we will be using ConversationalRetrievalChain, which comes with a memory component; to add memory to our chatbot we require ConversationBufferMemory.

What is the advantage of adding memory? For example, if I ask my chatbot 'Who won the 1979 World Cup?' (the 1979 World Cup was won by the West Indies), and then ask a second question, 'Who was the captain of that team?', I never mention in the second question that I mean the West Indies team that won the 1979 World Cup, yet my chatbot will still answer that Clive Lloyd was the captain of that team, because with the memory component the chatbot keeps track of the previous conversation history. You can remove the unused imports, such as import os, which we do not require; we keep sys, which we will use to exit from our chatbot. Those are all the libraries required in this tutorial.

Next I call CSVLoader, pointing it at the 2019.csv file inside my data folder, and define the encoding and the CSV arguments (you can skip these). Now that we have passed the CSV file path, we extract the data from the file by simply writing data = loader.load(). Let's run this and see whether we are able to extract the data: if I run python script.py, you can see that we have extracted all the data from our CSV file; 'Country or region' was the first column, then Finland, the score, GDP per capita, and so on. That looks great.

Going back to the architecture: after extracting all the data from the CSV file, we split the data into text chunks, then create embeddings, and then store the embeddings in our knowledge base. A sketch of the code so far appears below, and then let's implement those remaining steps.
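Here is a minimal sketch of the imports and the loading step described above, using the classic (pre-0.1) langchain module paths from around the time of the video; the file path matches the layout shown earlier:

```python
from langchain.document_loaders.csv_loader import CSVLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS
from langchain.llms import CTransformers
from langchain.chains import ConversationalRetrievalChain
import sys

# Load the CSV file; CSVLoader turns each row into one Document.
loader = CSVLoader(file_path="data/2019.csv", encoding="utf-8",
                   csv_args={"delimiter": ","})
data = loader.load()
print(data)
```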
Now we use RecursiveCharacterTextSplitter to split our text into a number of small chunks, where each chunk contains a maximum of 500 English characters and there is a chunk overlap of 20 characters. Let's split our data and see how many text chunks we get with 500 English characters as the maximum: you can see that we get 156 text chunks. So after splitting the complete data into small text chunks containing a maximum of 500 English characters each, we have a total of 156 text chunks.

Next I download the Sentence Transformers embeddings from Hugging Face, convert our text chunks into embeddings, and then store those embeddings in our knowledge base, the FAISS vector database. I create a new folder for the vector database and set the path to vectorstore/db_faiss; our embeddings will be saved into this folder if the script runs successfully. Let's run this and check; it might take a few seconds.

In the meantime, let's look at the next part as well. Once we have stored our embeddings in the local FAISS vector database, the user asks a question, we do a semantic search, and then we rank the top three answers. Now it's done: we have converted our text chunks into embeddings and saved them with the FAISS vector database, and here you can see the .pkl and .faiss files of our embeddings. That looks good.

Now suppose the user asks a question: 'What is the value of GDP per capita of Finland provided in the data?' I write docs = docsearch.similarity_search(query, k=3), passing in the query, and then print the top results which we get. So what we are doing here is: when the user asks a question, we create embeddings for that question, then do a semantic search, and then rank the top three answers; you can rank the top five answers as well.
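A sketch of the splitting, embedding, indexing, and search steps just described (the embedding model name is an assumption; all-MiniLM-L6-v2 is the usual default for HuggingFaceEmbeddings):

```python
# Split the extracted documents into chunks of at most 500 characters,
# with a 20-character overlap between consecutive chunks.
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=20)
text_chunks = text_splitter.split_documents(data)
print(len(text_chunks))  # 156 for the 2019 World Happiness Report data

# Create an embedding vector for every chunk and index them with FAISS.
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2")
docsearch = FAISS.from_documents(text_chunks, embeddings)
docsearch.save_local("vectorstore/db_faiss")  # writes the .faiss / .pkl files

# Semantic search: embed the question and retrieve the 3 most similar chunks.
query = "What is the value of GDP per capita of Finland provided in the data?"
docs = docsearch.similarity_search(query, k=3)
print(docs)
```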
You can see that it is progressing; here is the result ranked number one, the answer which we get, followed by the other top-three answers, so in this way we have ranked our answers.

While it runs, let us define our Llama 2 model. As we have the model in GGML format, we use the CTransformers library, which provides bindings for models implemented in the GGML format. We set the temperature value to 0.1. The temperature varies from zero to one: if you want your model to be very creative, for example for blog or article generation, you set the temperature high, but if you want the model to be deterministic and to focus only on the data to generate its response, you set the temperature low, close to zero, which is what we want here. We point the model path at the .bin file inside the models folder, and we set the maximum number of new tokens so that the output is around 2,000 English characters, that is, about 512 tokens, because one token is roughly equal to four English characters.

Now let's have a chat with our chatbot. I add the chat history component here, and we use ConversationalRetrievalChain. The conversational retrieval chain is built on the retrieval QA chain, which we have already worked with in our previous projects, and adds a memory component. If you want to learn more about ConversationalRetrievalChain, search for it in the LangChain documentation and open the first result: the conversational retrieval QA chain is based on the retrieval QA chain and provides a chat history component, so it can keep track of the previous chat history; you can review all those details there, and we will go into more depth in our upcoming projects. Here I pass my Llama 2 model, and I pass a retriever built from our embeddings, since we have already converted our text chunks into embeddings.

As we are using ConversationalRetrievalChain, which provides a chat history component, we keep a chat history list: the user is asked for a question, and if the user writes 'exit' the system will exit; otherwise, whenever the user asks a question, the chain generates a response (a minimal sketch of this part follows below). Let's run the script now and have a chat with the chatbot we have built on this CSV data; it might take a few seconds before it starts.
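A minimal sketch of the model definition and chat loop under the same assumptions (classic langchain paths, the q4_0 filename from earlier, and the documented config dict for the CTransformers wrapper):

```python
# Load the 4-bit quantized Llama 2 chat model on the CPU via CTransformers.
llm = CTransformers(
    model="models/llama-2-7b-chat.ggmlv3.q4_0.bin",  # filename is an assumption
    model_type="llama",
    config={"max_new_tokens": 512,   # ~2,000 characters at ~4 characters/token
            "temperature": 0.1},     # low temperature: deterministic answers
)

# Conversational retrieval chain: FAISS retriever plus chat history.
qa = ConversationalRetrievalChain.from_llm(llm, retriever=docsearch.as_retriever())

chat_history = []
while True:
    query = input("Input Prompt: ")
    if query == "exit":
        print("Exiting")
        sys.exit()
    # The chain embeds the question, retrieves relevant chunks, and
    # passes them to Llama 2 along with the running chat history.
    result = qa({"question": query, "chat_history": chat_history})
    print("Response:", result["answer"])
    chat_history.append((query, result["answer"]))
```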
Now let me ask the first question at the input prompt: 'What is the value of GDP per capita of Finland provided in the data?' It might take a minute to generate the response; let's see what we get. Here you can see the response: the value of GDP per capita of Finland provided in the data is 1.340. Let us check whether this is correct: if we open the CSV file, you can see that for Finland the GDP per capita is indeed 1.340.

Next let's ask another question: 'What is the healthy life expectancy value in Afghanistan provided in the data?' This is the second question I have asked; let's see what response we get. Here we get the response: according to the data, the healthy life expectancy in Afghanistan is 0.361. If we compare this with the CSV file, we can see that it is 0.361, so we have got the correct answer.

In this way you can continue the chat further, and that's all from this tutorial. Thank you for watching, have a great day ahead, bye bye.
Info
Channel: Muhammad Moin
Views: 5,286
Keywords: llama 2, llama, langchain, chatcsv, csv, chatbot, large language models, generative ai, generative models, deep learning
Id: PvsMg6jFs8E
Length: 28min 32sec (1712 seconds)
Published: Wed Sep 06 2023