Create ChatBot Based On The Data Feed By You - GPT-Index | OpenAI | Python

Video Statistics and Information

Captions
Hello everyone! In this video I will show you how you can create a chatbot based on the data fed by you, so that its responses are based entirely on the data you give it rather than being random. We will be using the OpenAI API along with Python, so let's get started.

First of all, we need to install a few required modules. The very first one is gpt-index, so I would say pip install gpt-index. This time I am using a Jupyter notebook inside VS Code, and that's the reason I'm installing everything there. We will use this module to build our vector index, which will then be used for querying our entire knowledge base. The other package is langchain, so pip install langchain; this is the module we are using to access the OpenAI API. Let me go ahead and install both. Okay, done.

Next, we need to import the required libraries. From gpt_index we need quite a few things. The first one is SimpleDirectoryReader, which is used to read our knowledge base. The next one is GPTListIndex, which we are using for indexing our data, and then GPTSimpleVectorIndex, which we are using to load our indexed data. We need two more things: LLMPredictor and PromptHelper. LLMPredictor will be used to define which LLM (large language model) we are using, and PromptHelper is mainly used for shaping the user prompt: the message size, the number of tokens, and all those things which we have already seen in my previous videos.
The next thing we need to import is OpenAI, so from langchain we will import OpenAI. Then we just need a couple of basic ones, import sys and import os, because we need to deal with the local file system as well. With that, we are done with our imports.

Next, we need to define our OpenAI API key, which we will store in an environment variable. Let's create another code block and name the variable, all in caps, OPENAI_API_KEY; we will put something in it shortly. If you're not aware of how to grab an OpenAI API key, I can quickly show you: go to openai.com, click on API, and it will ask you to log in, so just go ahead and click the login button. Once you are logged in, click on your name on the right-hand side and you will see View API keys. You can go ahead and create a new key, or if you have already made a note of an existing key, you can certainly use that. I have already noted mine down, so let's move on to the next thing.

Now comes the knowledge base: we need some data, some knowledge on which our system can run. We have a few options here. Either we can download relevant data from a website, or you can take it from your local machine if you have already stored some data there. For this particular example, I'm taking data from a website called gutenberg.org, where you can find many ebooks to download. I have taken one of them, The Art of Money Getting, and saved it in a text file. The one thing to remember here is that whatever data you get, make sure you are keeping it all in one directory.
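To recap the two setup pieces from this section: storing the key in an environment variable, and keeping the data files together under one directory. Both the key value and the file name below are placeholders, not real values.

```python
import os

# Placeholder key: paste the one you copied from the
# "View API keys" page of your OpenAI account.
os.environ["OPENAI_API_KEY"] = "sk-your-key-here"

# Expected layout: all knowledge files under one directory, e.g.
#   knowledge/
#   └── art_of_money_getting.txt
```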
In my case, I'm putting everything in a folder named knowledge, so if tomorrow there is more than one text file, all of them will be placed in the same folder and you need not worry about where they are located. If you have a huge amount of data, I would recommend keeping it in multiple files under that same directory. Okay, so we are done with the directory.

The next thing we need to do is work on the implementation part. For that, we have to write a function which will take care of creating our vector index. Let me add a new code block and define a function named create_vector_index, which takes a path; this path is nothing but the directory in which our data is present. First, there are a few parameters we need to declare. The max input size: let's keep it 4096. Then we need to define how many output tokens we want, so here I would take 256. Then we need the chunk size; the chunk size is nothing but how much data you want to grab in one shot, so this is 600. Chunk size is very useful when you are dealing with a large language model, because we cannot read the entire data in one shot, and that's why we need to define some size. And let's define one more, which is the max chunk overlap; I'm giving 20.

Once these things are defined, the very important thing is to define our prompt settings. For that, let's create a variable named prompt_helper which uses the PromptHelper class, and here we need to pass in a few things; you can definitely check the expected parameters by using IntelliSense. The first one is the max input size, then we need the number of output tokens, then the max chunk overlap, and finally the chunk size, which I think needs to be passed as a limit, so chunk_size_limit equals our chunk size.
Let me quickly run this cell, because we are not getting all the required IntelliSense otherwise; once I execute it, we should see the IntelliSense and it will be easier to type. Then we need to define the language model. Here we will be taking the LLMPredictor, like I said before, and I will name the variable the same, just in lower case. The llm_predictor takes the LLM we want to use; like I said, we want to go with OpenAI, so give it OpenAI, and then we need to define the parameters I have discussed in one of my earlier videos: the temperature, the number of tokens, and which model you want to use. So here we need to define the model name and the max tokens, equal to the number of output tokens we defined above.

Let me quickly go to the OpenAI playground again: these are the different parameters we need, and these are the available models, so you can use any of them. We are mainly dealing with text, so we can go with any of these four; based on what kind of results and how much accuracy you want, you can choose. If you are just testing, you can go with text-ada-001, which is less optimized and will give you results close to accurate, though there could be some discrepancy; but if you are looking for very precise results, then I would suggest going with one of the stronger models. Let me go with text-ada-001; this is just for demo purposes and it's going to cost me less, so that's the reason I am going with this one.

Once the predictor is defined, we need to load the data. Loading the data means we need to read those text files which we have created. For that, let's create a variable named docs, and here we will be using SimpleDirectoryReader, which takes the path as a parameter and has a load_data method.
Okay, we are done with our data-loading part. Next, we need to start our vector creation. Let's use the same variable name, vector_index, so that it won't be confusing later on, and here we will be using GPTSimpleVectorIndex, which takes a few parameters. The first one is documents, and here we will just say docs, the variable we created above; then it requires the llm_predictor, so let's give it our predictor; then it needs the prompt_helper, so here I will say prompt_helper. Now let's take this vector index and save it to disk, because whatever the vectorized data is, we need to save it somewhere so that we can grab it again while performing a query. Here you need to pass the file name in which your vector index will be stored, so I will say vector_index.json, and the last line will simply return the vector index. With that, we are done with our vector-creation part.

Before running, I need to grab my API key, so let me quickly paste it in and run that cell so the environment variable gets created. Now we can run the cell with create_vector_index. Okay, it executed successfully; let's verify whether the index is generated. Before looking for the file, though, I notice I forgot to make a call to this function, so let's do that: I would say vector_index equals create_vector_index, and here we need to pass our folder name, which is knowledge. Let me execute this; it is going to take a few seconds because it is trying to generate the index. It is saying OpenAI error; that is happening because the references didn't flow properly, so let me stop this, re-run the earlier cells to quickly re-create the function, and then re-execute the call.
You can see that it succeeded this time. Let me quickly verify whether the file is created: you can see it got created just now, and it is a JSON file with about 505 KB of data.

Next, we need to write a very basic bot. It is basic because I am not focusing on the UI here; I just want to showcase how the request and response take place. Let's write a small function for that, named answer_me, which takes the vector index file; let me name the parameter vindex to avoid any kind of confusion. Inside, we will use GPTSimpleVectorIndex's load_from_disk to load the data from disk, and here we can just pass in the vector index file. Next, we want this thing to run for as long as the user wants, so let's run it as an infinite loop: while True. Inside the loop we keep asking the user if they want to say something, so we'll take a prompt with input, saying something like Please ask. Then we want to get a response, so let's create a variable named response; we will use the loaded index to query, and into the query let's pass the prompt along with a response_mode, which is what type of response you want. We just want a limited one, so I will say compact. Finally, print the output; I'm using an f-string here to format the response, and let's add one line break at the end.

We are done with this; let me execute it and see if there are any errors. Okay, then comes the final call, where we call our answer_me function and pass in the name of the JSON file, the indexed one, vector_index.json. Let's quickly run it.
At the top you can see that it is asking the question, Please ask. I will ask: who is the author of this book? Press Enter, and you can see the response appearing: the author of this book is P.T. Barnum. Now let me ask one more thing: what is this book all about? Press Enter and you will get a response which is relevant to the book. You can see it: this book is all about how to make and keep money, and provides financial independence; all of this is coming directly from the book. Now you can ask any question you want, and rather than reading the book yourself, you can ask this particular bot to get the answers for you. This is very useful whenever you want to answer questions based on your textbooks or any documents you have with you, or say you have a research paper and want to grasp a summary of it; that is one of the best use cases, I would say. So this is how you can utilize it. Like I said, this is just a basic bot, so I'm not providing any UI and am just going with the default one appearing here. I hope you find this video useful; do not forget to like it and share it. Thanks for watching!
Info
Channel: Shweta Lodha
Views: 9,650
Keywords: Python, Python for beginners, Artificial Intelligence, Python programming, Python tutorial, Jupyter Notebook, VsCode, Programming tutorial, Learn Python, Programming, Python coding, OpenAI, Machine learning, gpt-index, how to create chatbot using openai and python, create chatbot using gpt-index, How to create a chatbot based on my data, how to create knowledge base for chatbot, Shweta lodha, ChatGPT
Id: OWx49w-Zdhs
Length: 18min 48sec (1128 seconds)
Published: Tue Feb 21 2023