How to Create LOCAL Chatbots with GPT4All and LangChain [Full Guide]

Video Statistics and Information

Captions
In this video I'm going to take you step by step through setting up GPT4All locally on your machine, and then we're going to start building applications with LangChain to show you the different use cases and see how good these models really are. I'll cover a number of different ways to install it, because it can be pretty difficult, as I found out myself.

I'm making this video because I've had a number of requests, through comments on videos and through consultations with clients, about the privacy concerns people have over where their information is being sent when they build applications with OpenAI's APIs: what about all my customer data? What about all my organization's data? I don't want to send that off to them; how can I run this locally, or safeguard that data so it's protected and not exposed in any unnecessary way? There aren't many solutions to this data privacy issue in AI-based applications yet, but today we're going through one of the most promising ones: running capable language models locally, so we don't have to send our information off to external APIs at all. We can run them on our own machines and, eventually, on the servers we deploy our applications to.

If you're not familiar with GPT4All, the name means "GPT for all, and for everyone". It's an open-source project: a set of language models you can download and start using locally. In their own words, they provide demo, data, and code to train open-source assistant-style large language models based on GPT-J and LLaMA. GPT-J is an open-source GPT-style model, and LLaMA is the language model family that Facebook (Meta) released recently, although it's still restricted to research purposes only.

Over the past few days I've been exploring this technology to see how far I can push it and how much I can get out of it. So this video starts with a quick installation guide, which can be quite difficult if you're a Mac user like myself; it caused me a lot of issues, and I basically spent an entire day getting it set up. If you're a Mac user, you'll get a crash course on getting it working; if you're a Windows user, I think you'll have a much easier time and can skip ahead. Then I'll take you through a number of different ways of implementing this code and using these language models. The end goal, the final product, is a custom knowledge base chatbot like in many of my previous videos, except run locally: you provide your own PDFs or documents, we chop them up into chunks, embed them, put them into a locally run vector database, and then you can query that through a chatbot-style interface with chat history. That's a lot to take in, but a custom knowledge chatbot with chat history is the end result, so stick with me, because we'll get there in the end.

Let's jump into getting this installed. To get started, head over to the GitHub repo, which I'll link in the description, and scroll down to look at the different ways of installing it and using it locally.
They've recently released a desktop application that lets you run GPT4All in a nice app on your laptop. You can download it for the three major operating systems, play around with it, test its abilities, and see how it stacks up against GPT-3.5-turbo and the other models we're used to. But we're interested in building applications with this, so just being able to chat with it is basically useless to us; for that we might as well use ChatGPT. We want access to the models so we can interact with them programmatically, so scroll down to the Python client section for the first way of getting set up.

What you need to do is run the pip install nomic command: over in VS Code, open the terminal at the bottom and run it (it's already installed on my machine). Then copy the basic setup code from the repo and paste it into a file; it imports GPT4All from the nomic library we just installed. To test whether it's actually working, run python or python3 followed by the file name; I've named mine nomic_setup.py. On the first run it starts downloading the GPT4All model so it's stored locally on your machine. Here in Colombia my internet is quite slow, so I haven't been able to complete it with this method, but if it works for you and you're able to start prompting, you're off to the races and can plug it into everything we're about to build. This is essentially the first way of testing whether you can get it set up and working; if you run into issues, you can troubleshoot through the other stages I'm going to show you now.
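If you want to see roughly what that first test script looked like, here's a minimal sketch based on the nomic client's README from around the time of this video. The nomic package's API has changed since then, and the file name nomic_setup.py is just what I called mine, so treat this as illustrative.

# nomic_setup.py -- minimal sketch of the nomic GPT4All client (circa April 2023).
# Based on the README of that era; newer nomic releases expose a different API.
from nomic.gpt4all import GPT4All

m = GPT4All()
m.open()  # the first run triggers the local model download, which can take a while

# Send a prompt to the locally running model and print the completion
response = m.prompt("Write me a short story about a lonely computer.")
print(response)

Run it with python3 nomic_setup.py and, if the download completes, you should see a completion stream back.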
If you run into issues with that route, head back to the GitHub repo, go to the Python client section again, and follow the link to the official Python bindings for GPT4All, called PyGPT4All. This is what we'll use in the next way of installing and setting it up. As usual, we need to install something, so run pip install pygpt4all in your terminal (it's already set up for me). All of this code will be available for you to download. Once that's installed, the usage section of the PyGPT4All docs provides a snippet showing how to set things up, and that's essentially what I've got here: import the module, then load the model by pointing GPT4All at the path where the model file is stored on your computer.

You're going to need to download that model file; I'll link it in the description. Over on the side here you can see a folder called models, which holds gpt4all-converted.bin, the model we're going to use, plus a second model that we'll discuss later. You have to get this exact file, and I had a nightmare trying to get my hands on it, because the models the documentation points to usually need to go through a conversion process before they'll work. So I'm going to save you a couple of hours of headache by providing a direct link to the converted model, which should be ready to work locally straight away. Head to the description, download gpt4all-converted.bin, and put it into your models folder.

Back in the code, everything looks good: we have the prompt we're feeding to the model. Save the file (Cmd+S), then run python3 (or python, depending on your setup) followed by the name of your file; mine is pygpt4all_setup.py. When I run it, it generates a completion for "Once upon a time": "once upon a time, 100 million years ago, dinosaurs ruled...". It's written a little story for us.
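Here's a sketch of that basic completion script, following the PyGPT4All README of the time. The import path varied across early releases of the library, and the callback-based streaming shown here is how those versions printed tokens as they were generated, so treat the details as illustrative.

# pygpt4all_setup.py -- sketch of a basic completion with the PyGPT4All bindings.
# Import paths changed across early releases; this follows the README circa April 2023.
from pygpt4all.models.gpt4all import GPT4All

def new_text_callback(text):
    # Print each generated token as it streams in
    print(text, end="")

# Load the converted model from the local models folder
model = GPT4All("./models/gpt4all-converted.bin")

# Generate a short completion for a simple prompt
model.generate("Once upon a time, ", n_predict=55, new_text_callback=new_text_callback)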
If that worked for you, that's great news. But if you're a Mac user, you likely had a bit of trouble and couldn't get that output yet, so here's a crash course on how I solved it (I got ChatGPT to help me, which saved a lot of time). For me, Anaconda (conda) was what was causing the issues, so I had to fully uninstall Anaconda, then download and install Miniforge following its instructions, and then set up a virtual environment so that all the packages were installed into a separate, clean environment. I know it sounds complicated, but all of these instructions are in a document linked in the description for any Mac user who's struggling; just follow the steps in order to get a working virtual environment. As you can see down here, I'm in a virtual environment called gpt4all. If you're not using conda or Anaconda you might not run into any of these issues, which is great, but if you are, follow those instructions in the description.

Now that we have a basic setup running with this little script, we can move on to some more advanced applications using LangChain, which is the really exciting stuff. Heading over to the basic LangChain example: this does essentially the same thing, but using prompt templates and chains from LangChain. If you're following along and want to reuse this code, you can download all of it from the GitHub repo in the description.

The key thing here is to install the correct version of LangChain. Because it's such a cutting-edge library, it's being updated every couple of days, and you will run into issues if you don't pin the specific version you want. Whenever you're watching this, I urge you to head to PyPI, the Python Package Index (linked in the description), and check the most current version; I did this literally yesterday and it's already been updated again. To install the correct version, copy the version number, head back to VS Code, and run pip install langchain== followed by the latest version, which for me is 0.0.149. You'll see it actually updated mine, because I had 0.0.148 installed. So make sure you're getting the latest version and pinning it explicitly with ==.

With that installed, all the imports at the top should work; as you can see, we're importing GPT4All from LangChain's llms package. Same sort of thing as before: we have the local path to the model we're working with, the callback manager you need to set up as part of LangChain, and then a prompt template. This one is just an example: it has a question placeholder, the templated insertion of the question, and then "Answer: Let's think step by step", which puts the model into a chain-of-thought style of answering. That typically works better for these models, because they're not as good at responding as the OpenAI models you're used to. The rest of this just sets up the prompt from the template, with question as the input variable, so whenever we ask a question it replaces that placeholder. In this version we hard-code the question, "What NFL team won the Super Bowl in the year Justin Bieber was born?", and below it there's also a way of accepting user input through the terminal; we'll run through both. The final part of the script is llm_chain.run: we set up our chain from the LLM and the prompt, and the chain runs the whole process.

Now if I go to the terminal and run python3 basic_langchain_setup.py, it loads the model, asks the hard-coded question, and shows the model's output in the terminal. After a bit of thinking, we get a look at the actual quality of these models' output. To answer the question, it should reason: okay, what year was Justin Bieber born? Then use its prior knowledge of that year and of who won the Super Bowl to give us the answer, and this chain-of-thought method is supposed to help it get there. But as you can see, it mostly rambles on with gibberish. I've seen other people get great responses with this, but I haven't managed to.
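For reference, here's roughly what that basic LangChain script looks like against the pinned 0.0.149 release. The file name basic_langchain_setup.py is my own choice, and later LangChain versions have moved these imports around, so this is a sketch rather than the exact file.

# basic_langchain_setup.py -- sketch of GPT4All driven through LangChain (~0.0.149).
from langchain import PromptTemplate, LLMChain
from langchain.llms import GPT4All
from langchain.callbacks.base import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

local_path = "./models/gpt4all-converted.bin"

# Stream tokens to the terminal as they are generated
callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])

# Chain-of-thought style template: nudge the model to reason step by step
template = """Question: {question}

Answer: Let's think step by step."""
prompt = PromptTemplate(template=template, input_variables=["question"])

llm = GPT4All(model=local_path, callback_manager=callback_manager, verbose=True)
llm_chain = LLMChain(prompt=prompt, llm=llm)

# Hard-coded question; swap in input("Enter your question: ") for interactive use
question = "What NFL team won the Super Bowl in the year Justin Bieber was born?"
llm_chain.run(question)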
That could potentially come back to the fact that, as you can see on the GitHub repo, they have both a GPT4All-J model and a plain GPT4All model, so the differences between GPT-J and the LLaMA model they're using could determine the quality of the output. I'd urge you to download the other model and compare; unfortunately my internet is a bit slow here, so I don't have time in this video, but you can see the GPT4All-J version there. You'll need to convert it, and once it's converted you can test the output back to back and see which gets you closer. This first example of getting output from these models isn't that great, so we're going to keep going and see how good we can get a custom knowledge chatbot to work.

And quickly: you can turn this into a user-input-based bot by uncommenting the input line and commenting out the hard-coded one. When the script runs, it pops a little input prompt in the terminal, and whatever you enter is submitted as the question and worked back into the chain. I've swapped those comments, saved the file, and run it again, and now the bottom of the terminal says "Enter your question". I entered the same question and, as you can see, it didn't quite get the answer again, but this shows you how to feed a variable, user-supplied input from the terminal into this model.

Next, we're going to create a custom knowledge base and then query it in a one-directional format; this is like creating a LlamaIndex index that you can ask one question at a time. This script is called custom_knowledge_query.py, and I've got a little description at the top: the script allows you to create a vector store from a file and query it with a hard-coded question. It shows you how you could send questions to a GPT4All custom knowledge base and receive answers. If you want a chat-style interface over a similar custom knowledge base, check out the other file, which we'll go over last.

This one starts a little differently. We still have the GPT4All model path, the same as before, but now we also have a LLaMA model path. You'll have to download that model via a link in the description too; it took me a while to find all the different models needed to get this working, because making them talk to each other can be quite a struggle, so all the download links are in the description. You need this specific LLaMA model because it's what we use for the embeddings step. On the left you can see the models folder now holds both gpt4all-converted.bin and this LLaMA model.

We're also taking in an example document, the State of the Union, up in the docs folder; it's the speech from Joe Biden, which talks about Ukraine and all that sort of thing. We use a shortened version first to test that everything works, then the full version. We're going to index it the same way you would with something like LlamaIndex, but we're going to do it all manually.
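Here's a sketch of that setup portion against the same LangChain version: two local model paths, a document loader, LLaMA-based embeddings, and GPT4All as the answering LLM. The doc path and the LLaMA embedding model file name are placeholders, since the exact quantized checkpoint you download may differ.

# custom_knowledge_query.py (part 1) -- setup: loader, embeddings, and the LLM.
from langchain.llms import GPT4All
from langchain.embeddings import LlamaCppEmbeddings
from langchain.document_loaders import TextLoader
from langchain.callbacks.base import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

# Two local models: GPT4All generates answers, the LLaMA model produces embeddings
gpt4all_path = "./models/gpt4all-converted.bin"   # from the link in the description
llama_path = "./models/ggml-model-q4_0.bin"       # placeholder name for the LLaMA model

callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])

# Embeddings model used to turn document chunks into vectors
embeddings = LlamaCppEmbeddings(model_path=llama_path)

# Loader for the example document (shortened State of the Union first)
loader = TextLoader("./docs/state_of_the_union_shortened.txt")  # placeholder path

# The answering model
llm = GPT4All(model=gpt4all_path, callback_manager=callback_manager, verbose=True)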
With that setup in place, we have a couple of functions that let us create this custom knowledge base index. First, depending on the size of the document, you may need to chunk the text up, so we have a split_chunks function using one of LangChain's text splitters; you can specify the chunk size and so on, but I'll leave the defaults for now. Then we need to actually create the index, the same way you would with something like LlamaIndex: a function that takes in the chunks we just created and returns an index. We're using the FAISS package, which LangChain supports as a vector store, so it gives us a very handy way of creating a vector database locally and returns the search index so we can start using it. Finally, we have a similarity_search function that takes the user query and the index we've created, and gives back the documents that best match the query. Those matched documents are what we feed into the final generation phase: okay, we have all the context, now let's actually answer the question. This is the same sort of custom-knowledge-base chatbot setup.

Now we call all of these functions: take in our documents, create the index, and then save the index so we don't have to keep rebuilding it, because as you'll see, these models can take a very long time to generate all the embeddings and create the index. There's a line here that saves it to a local file so you don't have to redo this every time the app boots, which is something I'd suggest. So first we create the docs using the loader above, which points at the shortened State of the Union text, the little slice we'll test with first. Down here we split those documents into chunks, then create the index out of those chunks. On your first run of the script, leave this block in: it takes your document, chunks it up, creates the index, and saves it under the name you give it. As you can see on the left, I've already got this done; there's the index over the shortened State of the Union text we talked about before. Once it's saved and ready to go, comment this block out and uncomment the loading code below it.
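And here's a sketch of those helper functions plus the first-run flow, reusing the embeddings and loader defined above. The splitter choice, chunk sizes, and index name are my assumptions rather than fixed requirements.

# custom_knowledge_query.py (part 2) -- chunk, index, and save (first run only).
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores.faiss import FAISS

def split_chunks(sources):
    # Break the loaded documents into small overlapping chunks for embedding
    splitter = RecursiveCharacterTextSplitter(chunk_size=256, chunk_overlap=32)
    return splitter.split_documents(sources)

def create_index(chunks):
    # Embed every chunk with the LLaMA embeddings and store the vectors in FAISS
    texts = [doc.page_content for doc in chunks]
    metadatas = [doc.metadata for doc in chunks]
    return FAISS.from_texts(texts, embeddings, metadatas=metadatas)

def similarity_search(query, index, k=4):
    # Retrieve the k chunks most similar to the query, plus their source metadata
    matched_docs = index.similarity_search(query, k=k)
    sources = [doc.metadata for doc in matched_docs]
    return matched_docs, sources

# First run: build the index and save it locally, then comment this block out
docs = loader.load()
chunks = split_chunks(docs)
index = create_index(chunks)
index.save_local("state_of_the_union_index")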
From then on, every time you run the app, you won't have to wait a few minutes for it to index the data and save it; you can just pull the index from a local file with the loading line, which uses the same embeddings you created it with and reads the index files off disk.

Now we can finally query our custom index. The hard-coded question here is: "Summarize the comments about NATO and its purpose." For a bit more context I've actually indexed the full document; it took quite a while off screen, but I've got the full State of the Union indexed and saved using the method above. We also have another prompt template, which uses LangChain to frame things up and help get the right output. We run a similarity search against the index using the question: it searches the vector database for the most relevant pieces of information and retrieves the matched docs and sources. Then the template says "Please use the following context to answer questions": the information that comes back from the similarity search is injected in place of the context variable, and the hard-coded user question is put in after it. So essentially: what's the question? Okay, grab some context relevant to that question, put it into the prompt template, then restate what was asked in the first place and answer the question with all of that context. This is a basic retriever-generator system right in front of your eyes. And again, to make it as easy as possible for the model, we're using chain-of-thought prompting (if you don't know what that is, check out my prompt engineering video): "Let's think step by step", so it actually thinks it through, which increases the accuracy of the output. Finally, at the bottom is the LangChain side of things, which packages this all up: it takes the template, inserts the variables, pulls the context back from the similarity search, and sends it off via llm_chain.run, and then we get to see the response.

Let's test it out. The question is to summarize the comments about NATO and its purpose, and you can have a quick look at the full State of the Union text; I'm sure you're familiar with what it covers. With everything set up to query this index, go down to the terminal and run python3 custom_knowledge_query.py. It takes the information from the State of the Union index we created, queries it with the question, and returns the answer by injecting the retrieved context into the prompt template. You'll see it print "init from file", which saves a lot of time. Actually, I forgot one thing: if you're running from a loaded index, you also need to comment out all of the index-creation code, since you don't need to recreate it, so make sure you head up there and comment that out; it'll save you a ton of time.
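Here's a sketch of that query stage, reusing the embeddings, llm, and similarity_search defined above. The template wording follows what's described in the video, and passing both template variables as keyword arguments to run is one way to do it in this LangChain version.

# custom_knowledge_query.py (part 3) -- load the saved index and answer a question.
from langchain import PromptTemplate, LLMChain
from langchain.vectorstores.faiss import FAISS

# Load the index built earlier, using the same embeddings it was created with
index = FAISS.load_local("state_of_the_union_index", embeddings)

question = "Summarize the comments about NATO and its purpose"

# Retrieve the most relevant chunks and join them into a single context string
matched_docs, sources = similarity_search(question, index)
context = "\n".join(doc.page_content for doc in matched_docs)

# Inject the retrieved context ahead of the question, with chain-of-thought phrasing
template = """Please use the following context to answer questions.
Context: {context}
---
Question: {question}
Answer: Let's think step by step."""

prompt = PromptTemplate(template=template, input_variables=["context", "question"])
llm_chain = LLMChain(prompt=prompt, llm=llm)

print(llm_chain.run(context=context, question=question))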
We save it again, clear the output, and run it once more to see how quickly it goes this time. Bam, straight into answering the question. You can see it first spits out what it retrieved from the custom knowledge base: a few little snippets of text from the speech, with a couple of mentions of NATO in there, so it seems to be doing a pretty good job. Then it takes all the context it grabbed, puts it into the prompt template, and answers the question with it. And here we have our answer: it queried our custom knowledge base correctly, got the context, and to "summarize the comments about NATO and its purpose" it replied that NATO is an alliance that was created to defend its member states. It's actually done a pretty good job when given a custom knowledge base, compared to when it was querying its own trained knowledge. Perhaps that's a sign the training data wasn't as extensive or as familiar with current events and the history of the world as we're used to with ChatGPT, which seems to know everything because it obviously has a better training data set. So maybe these models are better used with your own data sets rather than being asked to answer questions the way ChatGPT or GPT-3 would.

That was an example of how we can create our own custom knowledge base index, query it, and draw insights out of it without ever having to use an API or OpenAI and send that data off. Everything was done locally on the computer: the llama.cpp embeddings for the embeddings, and GPT4All to generate the answers.

Now we're on to the final example of what you can build with this. It took me a while to get this working with LangChain and these custom models, but what we have here is a custom knowledge chatbot in the same style as my previous videos: I'm able to chat back and forth, with a chat history, about a custom knowledge base. We go through many of the same steps as before, but instead of hard-coding a single query and getting information back once, we can have a conversation with it, and it remembers the context of the chat, so it's kind of like a ChatGPT experience, except limited to the information we're giving it. Most of this script is exactly the same as the previous example; down here you'll see many of the same functions. We set up the language model as GPT4All, and I've put in a context limit, which I believe is the max tokens for this particular model. All the index-generation code is still in here if you want to use it on its own, but because I've already got the index from the previous step, we can just load it directly and get into using this as a chat. At the top you can see the index path set to the full State of the Union index, and on the side there's the folder holding that index, which is the one we're going to query. Same sort of thing as before: you'll want to uncomment the generation code once to build yourself an index, but we're going to load it from a local one to save us time.
The key thing here is ConversationalRetrievalChain.from_llm, which lets us pass in the LLM we want to use (GPT4All, which we set up above) and take the index we created, using it as a retriever. I've put in a max token limit, because I've had a couple of issues with the token limit causing it to break, so I've set it to a conservative 400; that limits the amount it retrieves from the index to 400 tokens. That gives us the qa object that we query inside the chatbot loop. The rest is a fairly standard chatbot loop: a while loop, an empty chat history list, and then we take the query from user input in the terminal and pass it as the question to this conversational retrieval chain, along with the chat history. Each time through the loop it takes the user's input and queries with the chat history, which is empty the first time, but after every query we append the messages to the history, so each time you message it, it remembers what was said before (there's a sketch of this whole loop just after this section).

So we can save this, come down to the terminal, and run it with python3; it's all set up, and it asks me for a question. I'll say: "What did the president say about Ukraine?" This process can take a very long time, as you're seeing here; I think that's a sign that this kind of technology isn't quite at the level it needs to be to create fully fledged custom knowledge systems that run locally. But it's a good way to get familiar with these models while they're available; over the next months and years they're going to become a lot more powerful, easier to run, faster, and so on, so getting familiar now with how LangChain lets you interact with them and set up chat loops like this over a custom knowledge base is pretty cool. Here you can see I asked what the president said about Ukraine; it pulled in context using the prompt template, and the answer it gives me is that "he said that he is always there for them and their potential to live life without fear or discrimination". Now we can test that the chat history is working by asking a follow-up that references that answer and seeing if it gets it right. I'll ask "Who are they being discriminated by?" and see if it picks up on what I'm talking about.

As you can see, the quality of these custom knowledge chatbots with history is much, much lower when you're using models like this, and I think that comes down to not using the OpenAI embeddings. The text-embedding-ada-002 model we typically use is a lot better; it has 1,536 dimensions, I believe, so it's a very complex and advanced embedding model, and I'm not too sure about the LLaMA embeddings model we're using here. I think the combination of a different, locally run, open-source embedding model with a locally run, open-source language model for generating the answers gives us a pretty low-quality result, but as I said before, testing these out and getting familiar with running them is important at this stage in their development.
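Here's the promised sketch of that chat loop, reusing the embeddings and llm from the earlier sketches. The file name and the index folder name full_sotu_index are placeholders for wherever you saved your full index.

# custom_knowledge_chatbot.py -- sketch of a local chat loop with history.
from langchain.chains import ConversationalRetrievalChain
from langchain.vectorstores.faiss import FAISS

# Load the previously saved full State of the Union index (placeholder name)
index = FAISS.load_local("full_sotu_index", embeddings)

# Wire the local LLM to the index used as a retriever; cap the retrieved context
qa = ConversationalRetrievalChain.from_llm(
    llm,
    index.as_retriever(),
    max_tokens_limit=400,  # conservative cap to avoid context-window breakage
)

# Standard chatbot loop: each answer is appended to the history so the chain
# can resolve follow-up questions like "who are they being discriminated by?"
chat_history = []
while True:
    query = input("What's your question? ")
    result = qa({"question": query, "chat_history": chat_history})
    chat_history.append((query, result["answer"]))
    print("Answer:", result["answer"])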
It seems like it got the answer right after a long wait, saying "the Ukrainian people have repeatedly shown that..." (repeated quite a lot), so it picked up on who I was talking about, but again, the quality of output on these things isn't that great right now.

That's been a brief rundown of the different use cases for GPT4All: how you can get a custom chatbot, build a custom knowledge base, and query it. You can take all of this code, put it into your own projects, and start playing around with it. I hope the installation side of things was covered well enough, because I had a nightmare getting this set up on my Mac; hopefully one of the ways I provided works for you, and if you're a Mac user who had trouble, be sure to check out the documents in the description, where I give a full rundown of how I got it working. That's about it for the video. I hope this has helped you figure out how you can potentially address privacy concerns about sending all of our data off to companies like OpenAI, and whether local models like this are the way forward; in their current state, I think they're probably not quite there just yet.

As always, if you enjoyed this video and want to see more like it, head down below and subscribe to the channel, and if you enjoyed it, please leave a like; it would mean the world to me. I make a lot of tutorial-style videos like this to help entrepreneurs and people looking to build businesses and applications with AI. I try to simplify things as much as possible and give away the code, so if that sounds interesting, I've got a ton more on my channel you should check out; you're probably seeing it in the recommendation bar right now, and there's a lot of stuff on there, Slack bots, customized chatbots, and so on. If you have any ideas for videos you'd like to see in future, or questions you've run into, be sure to leave them in the comments below; I actually made this one based on a comment I received on a previous video, so I am giving you what you want, and this is genuinely interesting stuff. I'd love to get back to you with a video in the next couple of weeks. And of course, if this has lit up any light bulbs in your head and you think you could utilize this kind of technology in some kind of application, you can contact me as a consultant: book a call via the description and the pinned comment, and from there we can talk about feasibility, how much it could potentially cost to build, and, if you want to move forward and work with my development company, the kinds of things we can set up for you. My development company has also just started a newsletter, so if you want hot takes and juicy AI news delivered straight to your email inbox, you can sign up in the description and the pinned comment. Likewise, my AI entrepreneurship Discord is available in the description if you want to join a community of like-minded entrepreneurs, developers, marketers, and so on; we're all in there chatting about everything relevant in AI at the moment. That about wraps it up for the video; thank you so much for watching, and I'll see you in the next one.
Info
Channel: Liam Ottley
Views: 35,731
Keywords: gpt4all, gpt4all training, gpt4all install, gpt4all langchain, gpt4all python, gpt4all free chatgpt, local ai chatbot, run ai chatbot locally, run chatgpt locally, local chatgpt install, train chatgpt on your own data, train chatgpt on pdf, custom knowledge chatbot, how to install gpt4all, gpt4all chatbot, langchain tutorial pdf, langchain tutorial, langchain in python, chatgpt local install, how to create local chatbots, what is gpt4all, llama chatbot, what is llama ai
Id: 4p1Fojur8Zw
Length: 26min 24sec (1584 seconds)
Published: Fri Apr 28 2023