Easiest way to deploy privateGPT app! Ideal for Businesses & Organizations

Video Statistics and Information

Captions
Hey everyone. Today we're going to talk about something that became very popular in the last few days: privateGPT. Specifically, we'll cover how you can deploy privateGPT, and, to begin with, why you would need it, even without any coding experience.

The easy way to understand privateGPT is that it's a ChatGPT with complete data control. Why would that be needed? Your data might be sensitive, or it might be something that cannot leave your organization. You may be subject to privacy regulations that do not allow user data to be sent out of your country, or the data may contain personal information, like tax documents, that you cannot send through a hosted chat application like ChatGPT. In those cases you can benefit from privateGPT. A typical setup would be an enterprise with tremendous amounts of data that cannot leave the premises because it contains a lot of intellectual property. It might also be a government organization with confidential data that cannot leave the organization, or a privacy-focused product whose user data is heavily regulated, for good reasons.

We have looked previously at how any chat-with-your-documents or chat-with-PDF application operates. It takes a document and converts it into vectors (numbers), and those vectors are saved in a database that is then searched by queries. To make these systems easier to understand, you can decouple them into two separate flows: a document ingestion system and a retrieval system. The ingestion system takes documents, extracts the text, splits that text into small chunks, converts each chunk into a vector using an embedding model, and saves those vectors in a vector database. The retrieval system follows a similar pattern: instead of a document it takes the question, converts it into an embedding using the same embedding model, searches the vector database for the nearest neighbors, and provides those as relevant context to a generation API, which completes the answer. Notice that in a typical setup there are three external calls to OpenAI: one to embed the documents, one to embed the query at the start of the retrieval flow, and one to generate the final response. The other external dependency is the vector database. In privateGPT, the embedding models run locally on your server or computer, the database for saving vectors is local, and even the generation of the response happens with a locally stored model, so none of the data leaves your premises. Both flows are completely local.

If you look at the original privateGPT repo, it is quite popular, with over 19,000 stars and a lot of contributions and activity. The setup requires you to download the models to your computer and run it from there. I thought: why not turn it into an API that could be connected to any frontend? So that's what I did. I took the original repo and converted it into an API that can be deployed as a backend, plus a frontend, which is quite basic, just a Streamlit template.

There are two ways we'll talk about deploying privateGPT: a very easy, one-click way, and deploying on your own server. The easy method is clicking a link that takes you to a Railway-based deployment. Railway is a service similar to Render, which we have used quite often, and it makes it easy to deploy any given template. Before you proceed, I should mention that I ran into many issues on Railway: it works a few times, but it also fails. I would suggest perhaps skipping the Railway portion; you can certainly watch it and test it out, and maybe it won't error on your end, but I did hit a few failures. There is a similar way of deploying this app on render.com, which we have used in our videos; if you'd like me to make a video on that, I can. If not, the deployment on a local machine or server is toward the end of this video.

For Railway, you just hit Deploy and give the project a name; I'll call it "private GPT web app" and make it a private repo. Some of the environment variables are already filled in, and you don't have to change them unless you'd like to use a different model, in which case you specify the model here; the model path will be filled in automatically, or you can fill it out yourself. Basically, you only need to change the model type. Then you hit Deploy, and Railway creates a project for you and runs the build scripts. It takes around 8 to 10 minutes to build the application and make the links available, and then you can run it. One thing I should mention: the free instance being used here is likely to cause failures; it's not the most reliable. If you plan to use privateGPT on Railway, I highly recommend one of their paid plans, which, if I'm not wrong, charges per minute of usage and gives you about 8 GB of memory, which this application needs. In free mode it will most likely fail at runtime.

You might be wondering why you would deploy privateGPT in the cloud at all. Even in this case, you still have full control over your data, since it is embedded and stored within this application, and if you like, you can simply delete the application, which takes care of deleting your data. The deployment does take some time, and once it completes it shows a green check mark indicating that everything is working. If you click through, you'll see the Streamlit application, where you can upload your documents and retrieve information from previously uploaded ones. We'll look at how that functions in a few minutes.

Now let's look at running it locally on our computer or server, which brings us back to the repo discussed before. To distinguish the two: the original privateGPT repo helps you run it in a terminal, whereas the app we're looking at can be used as a backend API or as a full-stack app for your data. First, take the repo link; in the terminal, go to your projects directory and git clone the repo. Once that's done, cd into it and open it in VS Code, then look at the README for instructions. The first step is to create a virtual environment; I called mine "privateGPT". Next is the environment file: there is an example.env file, so create a new file called .env and paste everything from the example into it. Notice there is no API key; it's all happening locally, which is nice. The next step is to install all the requirements with the command provided. I had run some of these installs before, so most requirements are already satisfied, and it completes in a few minutes; there are a lot of dependencies.

Once that's done, there are two parts to the application: the backend API and the frontend. We'll run them separately in two terminals. The first command starts the FastAPI backend, and you can test it by opening it in the browser. It takes a few seconds: I wrote a script that downloads all the models for you, so you don't have to do it manually. Once the models download, you'll see the "Application startup complete" message, which means the backend is ready, and you can test it at the link that was opened before. It now says the APIs are ready for your inputs and queries.

The second step is to run the frontend, both to test connectivity to the backend and to see how the application works. Run the Streamlit app; this is quite fast compared to the backend, since all that's happening is the Streamlit application being built. Once it builds, you'll see the privateGPT app with an option to upload documents. In this case we'll upload the Constitution of the United States. When you hit Embed, the backend starts splitting and chunking the document and then embedding it; you'll notice a few progress messages along the way. It takes a few seconds to complete, and then you'll see a message saying the documents were embedded successfully. This step is relatively quick compared to the next one, which is querying your data. The uploaded document is now available in the dropdown list and we can search it. I'm going to ask "What does Article 1 say?" and hit Retrieve. It performs a few actions in the background, and this step does take some time. There are discussions on how to make it faster, which we'll talk about in our next few videos, along with other repos that do a similar task of building a chat-with-your-docs application locally, performing both the embeddings and the querying on your machine. It returns an answer about what Article 1 says, along with some references, and in the Streamlit app we can see the response as well as the source documents.

That brings us to the last point: if you are an organization that would like to build an application similar to privateGPT, with complete control over your data, please do let me know; we'd be happy to discuss it with you and make custom applications along those lines. Also, do share this with someone who might be looking for a similar application. With that, thank you so much for following along, and please reach out if you have any questions.
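The ingestion and retrieval flows described in the video (chunk the document, embed each chunk, store the vectors; then embed the question and find its nearest neighbors) can be sketched in a few lines. This is a toy illustration, not privateGPT's actual code: it uses a bag-of-words term-frequency vector in place of a real local embedding model, and a plain Python list with brute-force cosine similarity in place of a vector database.

```python
import math
from collections import Counter

def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into overlapping word chunks (ingestion step 1)."""
    words = text.split()
    step = chunk_size - overlap
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, max(len(words) - overlap, 1), step)]

def embed(text):
    """Toy 'embedding': a term-frequency vector.
    A real setup would call a locally hosted embedding model here."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Ingestion flow: chunk the document, embed each chunk, keep vectors locally.
document = ("Article 1 establishes the legislative branch. "
            "Article 2 vests executive power in the President.")
vector_store = [(chunk, embed(chunk))
                for chunk in chunk_text(document, chunk_size=6, overlap=2)]

# Retrieval flow: embed the question, find the nearest chunk by similarity.
question = "What does Article 1 say?"
q_vec = embed(question)
best_chunk, _ = max(vector_store, key=lambda item: cosine(q_vec, item[1]))
print(best_chunk)  # the stored chunk most similar to the question
```

In privateGPT the same shape holds, except the embedding function is a local model, the store is a real vector database, and the retrieved chunks are passed to a locally hosted generation model to compose the final answer, so no call ever leaves the machine.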
Info
Channel: Menlo Park Lab
Views: 9,946
Id: p35GygHpxoI
Length: 12min 41sec (761 seconds)
Published: Sat May 20 2023