PrivateGPT: Chat to Your PDFs Offline and for FREE in Minutes (Full Tutorial)

Video Statistics and Information

Captions
In this video we're going to be checking out PrivateGPT, an open-source project on GitHub that has been blowing up recently because it allows you to chat to your documents 100% offline. That's right: what I'm about to show you can be done completely offline, because it uses local language models on your machine or laptop, so you can have 100% private conversations with your documents without ever sending them off to OpenAI or any other company. The reason this GitHub repo is blowing up right now is how simple it makes it to install, ingest your own documents into a custom knowledge base, and start chatting with them using local models in a matter of minutes. So in this super quick tutorial I'm going to show you guys how to download and install PrivateGPT onto your own machine, ingest your documents, and then start chatting with them in just a few minutes.

But before we get started: if you haven't already signed up to my AI newsletter, called The Drip, we're giving you the condensed version of all the hottest AI news, delivered direct to your inbox every few days. So be sure to head down below to the description or the pinned comment, where you can sign up to my AI newsletter if you haven't done that already.

Before we actually get into installing it on your machine, let's take a quick look at the GitHub repo. You're going to be able to grab this link in the description. Here we can see PrivateGPT allows us to ask questions on your documents without an internet connection, using the power of LLMs. 100% private: no data leaves your execution environment at any point. You can ingest documents and ask questions without an internet connection. Now, some of you have been bombarding me with questions about local models and running things locally, so I think this is going to make a lot of you very, very happy. So let's dive into the example it gives us here: why was NATO created, etc. Built
using LangChain and GPT4All. If you haven't already seen my other GPT4All video, it's going to be linked here, or here (I always get it wrong). But basically they make it super simple for us to set this up: we need to download the entire repo, then we can install all the required dependencies using this requirements.txt file.

Now, one of the best parts about this repo, in my opinion, is that it has direct download links to the models that you need. If you click on these, it's going to let you download both models. These are absolutely required; it can't work without them, so make sure you click on both and get them downloading. I've already got them downloaded, so I don't have to do that just now.

Another thing that makes this so accessible to beginners is that it has an automatic ingestion script that you can easily run with whatever file you put in, with super simple instructions here for how to ingest your own data. This can be PDFs, it can be text files, but I'm going to show you that when we get started in a second. The end result, once we've got this all set up, is that you're going to be able to enter queries via the terminal and ask questions about your documents locally.

So let's jump into the installation. Scrolling back up to the top of the repo, we can click on this Code button, and if we go Download ZIP, we're going to download the entire repo so that we can start to use it in our own projects. Once it's downloaded, you need to unzip the files. You can use your favorite extractor; if you're on a Mac, double-click on it and it's going to expand out to this folder here. I'm going to be running PrivateGPT within VS Code, so you can open that up with me, or if you have a different IDE you like to use, you can use that too. An easy way to open this for Mac users is to click on this folder and just drag it to your applications bar and drop it
onto Visual Studio Code, and it's going to open it up. Yes, I trust the authors. For Windows users, you can open up Visual Studio Code, head up to the File menu, choose Open Folder, then go to your Downloads and open the folder that you've just downloaded.

As soon as you're in VS Code, you need to add a folder for your models. We can do that by clicking this new-folder button up here and naming it models. Now I need to go to my Downloads and drag the two models that I downloaded into that models folder. On screen you can see the two models; I'm just going to drag them into this models folder and let them copy over.

With those models copied over correctly, we can install our required dependencies. We have this requirements.txt file, and we're just going to install everything we need to run this correctly. As you can see back on the GitHub repo, the Environment Setup section has a pip install command that's going to install those requirements automatically. So you can copy this command, head back to the terminal that we've just opened up by going New Terminal, paste that command in there and run it. If you get the same error as me here, which says "command not found: pip", you need to run pip3 install instead. So you can paste it in here, and we can see pip3 install -r requirements.txt. Now I can run that; it's going to install all of these dependencies, and then we're ready to get started.

Taking a look on the left, we have a folder here that they've created for us called source_documents. This is where you can put any document that you want to ingest. By default they give us the State of the Union, which is sort of the default LangChain document. We can roll with this one for the purposes of this video, but you can delete it and copy in any document that you want. Say I had a different document in my
downloads; I could just drag it into that source_documents folder, and then you could follow the steps along. Just make sure that the name is changed when you're running the command.

On the left-hand side, if you click on the ingest.py file, you can see that this uses some LangChain document loaders, text splitters, etc. Out of the box, PrivateGPT doesn't actually have support for PDF files, so I'm going to have this little code snippet available in the description for you guys to steal, which I've whipped up quickly here, and which essentially allows you to put in your own PDF files without first converting them. All it's doing is a little check on what the document type is: if it's a PDF, it uses a different document loader from LangChain, but if it's a normal text file, it uses the normal PrivateGPT document loader.

To get this little PDF extension enabled, you need to head up to the top of the file. On the top row here we can see TextLoader being imported; we just need to add PyPDFLoader as well to the import from langchain.document_loaders, and that's going to remove the squiggly line there. You're also going to need to run pip3 (or just pip) install pypdf, and one more line of code that you need to add is import os. Then everything should be ready to go and we can start ingesting our documents. Remember to save this file when you're done.

Now we can get on to ingesting our documents. If we hop back onto the GitHub repo, we can see the instructions for ingesting our own dataset here. Pretty simple stuff: we need to run python ingest.py and then a path to our text file. Again, this is using a text file, but we have added support to allow you to use PDFs as well. So if I copy all of this and pop it into the code down the bottom here, just to make a little bit of space, I'll show you exactly how you can get this set up. The easiest way to get the path to the exact document that
you want is just to right-click on it and click Copy Relative Path, then come here and delete the entire thing here, including the carets, and now you have the entire command ready to go. I can copy that, head down to my terminal and go python; actually, in my case I need to change this to python3, and you may need to do the same. Then we can paste this in and ingest it.

Now, this is going to run in your terminal for quite a long time. It goes through the entire document, chunking it up and putting it into this db folder up here that it creates. It stores your index there, so that whenever you ask it questions it can query the DB and recall different chunks from it. I don't want to get stuck in the weeds, but essentially that's where your whole database is going to be stored.

After a couple of minutes, that's finally finished indexing the information, so what we have up here is our DB, all ready to be queried. Now we need to run the privateGPT.py script, which is what actually allows us to chat to that data. If we head back to the GitHub repo, under "ask questions about your documents locally" it shows that we just need to run python privateGPT.py. For me, I'm going to need to run python3;
you may need to as well: python3, and then paste that in, privateGPT.py, and we can run that. It's going to look all complicated while it loads these models up, but now it says "Enter a query". I'm talking about the State of the Union here, so I can ask: what did the president say about NATO? And here we see, after a few issues with tokenizing these unknown tokens, it's actually answering the question. As you can see, the answer: the president has spoken out in support of NATO and its role in maintaining peace and stability in Europe, emphasized the importance of alliances, etc. It also shows us the question that we asked, the answer that it's given, and then the source documents as well, which is very handy for seeing which chunks were used to create the answer.

So I hope you guys have been able to get this working. As you can see, these tokenizing errors pop up quite a lot, and you may have different issues depending on the documents that you put in. Given that I've added this little PDF extension so you can use the ingestion script on PDFs as well, you should be able to just drop a PDF in there, run the script, and it's going to ingest it.

Just to make this tutorial bulletproof: I know that in the coming weeks and months this may change a little bit, so be sure to check the repo for the latest information on how to install this; it's all going to be available there if it changes. Also, if you run into errors in the terminal window, it can help to just copy and paste them into ChatGPT and ask it for an answer. It might be a very simple fix that you can do yourself without having to ask anyone else for help.

So that's about it for the video, guys. I wanted to make this super short and sweet so that you can download this repo, have a play around with it, and start chatting to your documents locally without any internet connection. If you do have any issues with this and want some help, you can join my AI entrepreneurship Discord, which
is in the description and in the pinned comment. You can hop in there and ask some of the developers within the community for a little bit of help with this. Beyond that, if you want to reach out to me as a consultant and work on projects like this or other AI applications, you can book a consulting call with me in the description and in the pinned comment, and if you want to work with my development company to build out something bigger, you can also get in touch with me through my consulting link.

That's all for the video, guys. Thank you so much for watching. Please head down below and leave a like on the video if you've enjoyed it, and subscribe to the channel if you want to see more content exactly like this. I will see you in the next one. [Music] Thank you.
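
The PDF tweak to ingest.py that the walkthrough describes (check the file type, then pick a loader) is never shown in the captions, so the following is only a minimal sketch of that idea and not the snippet from the video description. The function names and the `report.pdf` path are hypothetical, and the import path is an assumption based on LangChain releases from around the time of the video; newer releases may have moved these classes.

```python
import os


def choose_loader_name(path: str) -> str:
    """Return the name of the LangChain loader class to use for `path`."""
    # Compare the extension case-insensitively so "REPORT.PDF" also counts as a PDF.
    extension = os.path.splitext(path)[1].lower()
    return "PyPDFLoader" if extension == ".pdf" else "TextLoader"


def load_documents(path: str):
    """Load `path` with the chosen loader (needs langchain, plus pypdf for PDFs)."""
    # Imported lazily so choose_loader_name() works without langchain installed.
    # Assumption: import path as of mid-2023; it may differ in newer releases.
    from langchain.document_loaders import PyPDFLoader, TextLoader

    if choose_loader_name(path) == "PyPDFLoader":
        return PyPDFLoader(path).load()
    return TextLoader(path, encoding="utf8").load()


if __name__ == "__main__":
    # "report.pdf" is a hypothetical file name used only for illustration.
    for example in ("source_documents/state_of_the_union.txt",
                    "source_documents/report.pdf"):
        print(example, "->", choose_loader_name(example))
```

Keeping the extension check in its own small function means the choice of loader can be checked without installing langchain or pypdf; only `load_documents` needs those packages.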
Info
Channel: Liam Ottley
Views: 40,928
Keywords: privategpt, how to install privategpt, privategpt setup, chatgpt offline, gpt4all training, gpt4all install, gpt4all langchain, gpt4all free chatgpt, local ai chatbot, run ai chatbot locally, run chatgpt locally, local chatgpt install, train chatgpt on your own data, train chatgpt on pdf, custom knowledge chatbot, how to install gpt4all, gpt4all chatbot, langchain tutorial pdf, langchain tutorial, chatgpt local install, how to create local chatbots, what is gpt4all
Id: xs9TnY2z8jE
Length: 9min 23sec (563 seconds)
Published: Sat May 13 2023