ChatGPT for your data with Local LLM

Captions
Hi, my name is Jacob, I'm a software engineer at Meta. In the previous video I showed you how you can create a ChatGPT for your own data with LangChain and the OpenAI APIs; the link to that video is in the description. In this video I want to show you an alternative solution.

Using the OpenAI APIs is pretty expensive. Depending on how large your document is, you can quickly use up the free $5 limit. For example, running 1,000 queries on a document with around 3,000 words would cost you about $7. To learn more about pricing, check the OpenAI API pricing calculator; I also dropped the link in the description, where you can learn more about how much the OpenAI API can cost you.

If you don't want to pay for the OpenAI API, you can use one of the open source large language models locally. Unfortunately, OpenAI's models are not open source, but there are a lot of open source models you can find on Hugging Face; I also dropped that link in the description. With these models you can build a solution with LangChain, similarly to how we did in the previous video. In fact, Iván Martínez created an awesome GitHub repo called privateGPT that allows you to do exactly that. It uses the sentence-transformers Python package with one of the open source embedding models to create embeddings, it uses one of the GPT4All-compatible LLMs, which you can find on Hugging Face, and it also uses LangChain. So without further ado, let me show you how to build a solution similar to what we built in the previous video.

Okay, the first step is to go to GitHub and find privateGPT. There's a link in my blog post and also a link in the description of this video. I'm just going to clone it: I'll copy the URL, go to my console, go to my dev directory, make a separate directory called private-chat-gpt, and clone the repo there. While it's cloning, I'll open the code in my editor.

Now we can open the README, which explains step by step what to do. The first step is to install the required packages; you can find which packages have to be installed in the requirements.txt file. If we go here, you can see these are all the packages you need in order to run privateGPT. I already did that step, so I won't repeat it, but you can just copy this line, paste it into your console, and it will install the packages.

The second step is to download an LLM and put it in some directory that you will point to later. I actually already downloaded two models: the default ggml-gpt4all-j model, and one that performs a little bit better, Nous Hermes. I have them in my Downloads/models directory: there's ggml-gpt4all-j, the default model from privateGPT, and Nous Hermes, which performs much better. The Hermes model is also bigger; you can see this one is 7.3 GB while this one is 3.7 GB, so it's almost twice as big. Let me copy them over: I'll cd into privateGPT, create a directory called models, and copy the model files into it. This will take a while, because I'm moving over 10 GB of data.

Then, once I have my models, I just need to configure the paths. The README says to copy the example.env file, but what I can actually do here is simply rename example.env to .env, and that will do.
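Collected together, the setup steps so far look roughly like this. This is a minimal sketch, assuming the repo URL and file layout privateGPT had at the time of recording:

    # clone the repo and install its dependencies
    git clone https://github.com/imartinez/privateGPT.git
    cd privateGPT
    pip install -r requirements.txt

    # put the downloaded GGML models where the app will look for them
    mkdir models
    cp ~/Downloads/models/*.bin models/

    # turn the example config into the active one
    mv example.env .env

The .env file itself contained roughly these settings at the time (the exact model file name may differ):

    PERSIST_DIRECTORY=db
    MODEL_TYPE=GPT4All
    MODEL_PATH=models/ggml-gpt4all-j-v1.3-groovy.bin
    MODEL_N_CTX=1000
    EMBEDDINGS_MODEL_NAME=all-MiniLM-L6-v2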
The first setting is the persist directory: this is where my embeddings will be stored, because the first step in creating this solution, as you learned in the previous video, is creating a vector representation of my documents. I will create the embeddings using the vector database, and they will be persisted in this directory; I chose the directory db. The model type is GPT4All, and the model path is models/ggml-gpt4all-j. You can see my models are still being copied over, so I'll start with the defaults, and I'll also use the default embedding model. That's pretty much it for the configuration.

Now, in the source_documents directory there is a State of the Union transcript. I'll make it a little more fun: I have an about.md file on my desktop, which is about me, so I'll copy it over to source_documents and use that file, removing the default State of the Union. The about.md is just a simple file with the basics about me: that I'm a software engineer at Meta with experience in web and mobile development, product growth, cloud computing, and artificial intelligence; then it lists the products I worked on, and it also lists my hobbies.

So next we're going to ingest the document, and then we'll query it, very similar to what I did in the previous video, just with a different approach this time: a local large language model. The models have finished copying and I have my source data. Let me go back to the README for one second: I went through the setup, I have my data set, and here you can see what types of files privateGPT supports. You can go all the way from Markdown and text files to CSVs, Word documents, HTML, and of course PDFs, which is pretty awesome, plus PowerPoint presentations and so on.

Now I need to ingest the document. What this will do is take my text files and convert them to a vector representation. Let me copy this, go back to my console, and run python ingest.py.
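Under the hood, ingest.py loads the documents, splits them into chunks, embeds each chunk with a sentence-transformers model, and persists the vectors to a Chroma database. Here is a minimal sketch of that pipeline using the LangChain APIs from that era; it's illustrative, not privateGPT's exact code:

    from langchain.document_loaders import TextLoader
    from langchain.text_splitter import RecursiveCharacterTextSplitter
    from langchain.embeddings import HuggingFaceEmbeddings
    from langchain.vectorstores import Chroma

    # load the source document (privateGPT ships loaders for many formats)
    docs = TextLoader("source_documents/about.md").load()

    # split into overlapping chunks that fit the model's context window
    splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
    chunks = splitter.split_documents(docs)

    # embed each chunk with the default sentence-transformers model
    embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")

    # persist the vectors to the db directory configured in .env
    db = Chroma.from_documents(chunks, embeddings, persist_directory="db")
    db.persist()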
Now it's creating embeddings using the specified embedding model and storing them in the db folder. You can see that I have a new folder here; these are just binary files, so there's nothing to see, but they are my embeddings, the vector representation of my data.

Now I can start querying my data, so I run python privateGPT.py. Let's see what I have here. Let's ask a question: who is Jacob? Let's see what privateGPT will tell me. I got a warning that it requested four results but there's only one document here. This will take a while, around 10 seconds, because I'm still using the smaller model. And here it gives me an answer: Jacob Jedryszek... it's okay.

Let's ask some other question: what are Jacob's hobbies? This should also take around 10 seconds. And here we go: Jacob has a variety of interests, including sports such as soccer and martial arts; he also enjoys outdoor activities like biking, hiking, and trails. This is actually pretty cool, because it took the power of an LLM and applied it to my data; it even took the plain bullet-point list in my file to the next level and categorized the hobbies.

privateGPT also shows the source documents used for the answer. We don't actually need that, and you can comment it out: if you go to privateGPT.py and search for the part that prints the relevant sources used in the answer, comment it out to get a cleaner view.

Now the third question: when did Jacob work on Seeing AI? In my file I provided a date range, 2017 to 2019, so let's see if this model can figure it out. Again, about 10 seconds, and let's see what we get. Okay, here we go: the context does not provide information about the exact dates when Jacob worked on Seeing AI. Well, this data is actually there, so let's try a more powerful model and see if it will be able to deduce that information from the text.

I'll kill this, go back to the .env, and swap the model path to the Nous Hermes model. Save, and let me run privateGPT again. I don't have to ingest the data again, because creating the embeddings is taken care of by the embedding model, which I didn't change; here I'm only changing the LLM that takes those embeddings as context and answers the questions I have about my data.
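For reference, the query side is roughly a LangChain retrieval chain over the persisted vector store, with a GPT4All-compatible model as the LLM. Another minimal sketch with era-appropriate APIs and an illustrative model path:

    from langchain.chains import RetrievalQA
    from langchain.embeddings import HuggingFaceEmbeddings
    from langchain.llms import GPT4All
    from langchain.vectorstores import Chroma

    # reopen the persisted vector store with the same embedding model
    embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
    db = Chroma(persist_directory="db", embedding_function=embeddings)

    # point the LLM at whichever GGML model .env references
    llm = GPT4All(model="models/ggml-gpt4all-j-v1.3-groovy.bin")

    # embed the question, fetch the most similar chunks, and have
    # the local LLM answer using those chunks as context
    qa = RetrievalQA.from_chain_type(
        llm=llm, chain_type="stuff", retriever=db.as_retriever()
    )
    print(qa.run("Who is Jacob?"))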
Okay, here you can already see that loading is taking a little more time: the model is twice the size of the previous one, and the queries will also take about twice as long, somewhere between 22 and 30 seconds. Let's see how long it takes to load. Okay, there we go.

Let's start with the same query: who is Jacob? Again, this will take between 10 and 30 seconds, so I'm going to fast forward and cut this from the video. Okay, there we go: 30 seconds, but look at this answer. Jacob is a software engineer at Meta with experience in web and mobile development, product growth, cloud computing, and artificial intelligence; he enjoys cycling, triathlons, hiking, sailing, soccer, and martial arts. That sounds pretty good.

Now let's ask the question the previous LLM couldn't answer for me: when did Jacob work on Seeing AI? Again, this will take about 30 seconds. If I had a more powerful machine, or ran this on a GPU, ideally on multiple GPUs in parallel in the cloud, I'd get answers in less than a second; here I'm just using the local CPU on my MacBook M1, which is actually a pretty powerful machine, but for these large language models it's not enough. Okay, 28 seconds, and look at that: according to the information provided, Jacob worked on Seeing AI from 2017 until 2019. And that's true.

So you've now seen the difference between more and less powerful models. The less powerful one is okayish, not great; with a more powerful model you get much better responses about your data. And if you use the best model, GPT-4, which unfortunately is closed source, so you cannot run it locally, you get even better answers, as you've seen in my previous video.

The trade-off of running models locally is speed. You've seen that, depending on the size of the model, running it locally on my machine took between 10 and 30 seconds to get an answer. OpenAI's models are deployed in the cloud and run on super fast GPUs; that's how you get answers in less than a second. Of course, you can deploy your open source models to the cloud as well and run your apps on GPUs, but then you have to pay for it. The big advantage, however, is that you have full control of your end-to-end flow. If OpenAI for some reason decides to close its API or drastically increase the price, there's very little you can do about it. Running open source models in the cloud puts only one constraint on you: the cloud provider's cost and potential changes to their environment. For example, they might say that a particular offering, like a virtual machine with a GPU, is no longer available, but they usually provide a path to upgrade, because they want you to keep using their compute. So that is not a worry, though the cost might be; I don't see it rising drastically, however. What OpenAI will do is a much bigger question, because they want to start monetizing, so they could actually drastically increase their prices, especially since they have the best model and people might be willing to pay a higher price for the best.

If you have any questions, please leave a comment; I'll be very happy to answer them. If you made it this far, you probably found this video useful; if that's the case, please hit the like button. You can also subscribe to my channel to get notifications about new videos.
Info
Channel: Jacob Jedryszek
Views: 9,934
Keywords: ChatGPT, Python, GenAI, Artificial Intelligence, AI, Machine Learning, Deep Learning, LLM, Large Language Models, Embeddings, Q&A
Id: bWrjpwhHEMU
Length: 16min 20sec (980 seconds)
Published: Thu Sep 07 2023