[Outdated] How to run a Chroma Vector Database locally and on AWS! | EASY MODE

Video Statistics and Information

Captions
Hey everybody, Timothy Carambat, co-founder of Mintplex Labs and creator of AnythingLLM, which is a full-stack personalized assistant that lets you chat with your own documents. Today I want to do a video about launching a Chroma vector database just for you on AWS, and we're also going to show how to do this locally, because that is also important.

I think the first thing to talk about is: what is Chroma? For LLMs, large language models like GPT-3 or GPT-4, or even the custom open source equivalents that exist out there, you may want to refer to text outside of the context window, and these custom models and GPT-3 and 4 come with increasingly different size context windows. Maybe you want to refer to something that you talked about a month ago. Maybe you want to reference a document that's 80 pages long and all you're talking about is just a simple paragraph. Embedding information essentially allows you to take large pieces of text, it could be an entire book if you wanted, and put that into a vector database. Vector databases then give you a superhuman ability to semantically search across that text and extract just the chunks that are relevant to whatever question you just asked.

Now, with vector databases there are a lot of options out there. The cool thing about Chroma, and why I want to focus on Chroma today, is that Chroma is open source. Frankly, I think it is very well featured. They have great support for both JavaScript and Python if you happen to be writing a single-use script, and it's also really easy to put on AWS and to run locally if you have Docker and you're more technically inclined. Another thing to note is that Chroma has absolutely great documentation, and they have a Discord where the founders actually talk to the community and help you figure out issues. So, a very good community: small, but very helpful. And honestly, I'm a stan of Chroma. I do like Pinecone, don't get me wrong, and LanceDB, two other vector databases supported on AnythingLLM, but Chroma right now just has my heart because of the open source nature and also just how many features it has.

Okay, enough about that, let's get Chroma running locally. One thing to note is I am on a MacBook Pro, and so I don't misrepresent this, I am on macOS Catalina. This is my version number. I'm too lazy to upgrade to the newest macOS, so your mileage may vary depending on what operating system you're running. Luckily, Chroma's local instance is dockerized, so if you can run Docker, it will probably work.

One of the first things we need to do is start our local Docker instance, and I'm doing that right now. Okay, so now we have wiped out everything that was on my Docker instance, so there's nothing going on in the background and nothing I've already done. Let's start from the very beginning.

The first thing you'll need is a terminal. With this terminal you can clone this repository anywhere you like, but I'm just going to clone it to my desktop. What we're going to do next is go to trychroma.com or docs.trychroma.com. Basically, we just want to get to the GitHub, that's what we're trying to do today. We're going to click Code, go to SSH, copy that, and then git clone it. This is going to clone the entire repository and create a folder on our desktop called chroma. Hopefully you understand this bit, because it is pretty necessary and there's really no other way to get around it. That's how git works.

Okay, so we have successfully git cloned the Chroma repository. The next thing we're going to do is hop into this folder, and there is a command that you're going to want to run for client-server mode. We want to run a full Chroma instance, and when you're running this locally you're pretty much only limited by the resources of your machine. Embedding a couple million documents, or a couple million embeddings, is honestly nothing, and it'd be very surprising if you started out with multiple millions of embeddings. We want to run in client-server mode because you probably have either AnythingLLM running or a custom script. Either way, you want to connect a client to it, and for that you need the server. So let's spin that up. The only command you need to run is docker compose up, and build the entire container. This process will take a while because you need to download the Chroma image and also the ClickHouse server image. Just give it a while, it'll spin it up, and then we can go from there.

Okay, Docker has built everything. If we open up Docker Desktop, you'll see that we have a folder that has two containers in it: the ClickHouse server and the Chroma latest build. You'll see that port 8000 is what this natively binds to. Something to know about this: be sure that you don't have something else running on this port, or it will fail to attach.

There is a way to check that Chroma is running and that it is performing properly. There have been reports where everything seems like it's working, but the API isn't working, so the client never works. This is an easy way to check. Open up a new tab in your browser and go to localhost:8000/api/v1/heartbeat. What this should return, if everything is working perfectly, is a nanosecond heartbeat with a long epoch number. If you do not see this result, Chroma is not running locally and you will not be able to successfully create, edit, or manage collections on your Chroma instance. If you do not know what is wrong with your Chroma instance, the easiest thing to do is go to the server container, open details, and go to the logs. You'll see it actually logs our request here.

Now, just to be transparent about this, one thing that they are up front about in the documentation is that there is telemetry enabled for Chroma. All this is is the ability for them to know whether people are even using their service. They don't have access to your data, and they don't have access to the vectors that you're inputting or anything like that. I implore you to keep telemetry on, because open source projects really live and die on use. Of course, if you don't want it enabled, you can turn telemetry off; it's in their documentation. I'm just advocating for keeping open source projects open and giving them an incentive to do so. It also helps them build a better product.

Now we have Chroma running locally, congrats. Connect this to a client. You can plug it into AnythingLLM, which you can get from, let's go to the AnythingLLM Discord, or not Discord, sorry, GitHub. If we go to the code and open the .env.example for the server, all you would need to do is uncomment VECTOR_DB equal to chroma, comment out everything else, and then set CHROMA_ENDPOINT to localhost, port 8000.
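The browser heartbeat check described above can also be scripted. This is a minimal sketch using only the Python standard library, not something shown in the video. It assumes the local instance listens on http://localhost:8000 and that the heartbeat response is a JSON object with a "nanosecond heartbeat" key, as the walkthrough describes; the function names are hypothetical.

```python
import json
import urllib.request


def heartbeat_url(base_url: str) -> str:
    """Build the heartbeat URL used in the walkthrough (/api/v1/heartbeat)."""
    return base_url.rstrip("/") + "/api/v1/heartbeat"


def parse_heartbeat(body: str) -> int:
    """Extract the epoch value from the JSON body.

    Assumes the key is "nanosecond heartbeat", as described in the video.
    """
    data = json.loads(body)
    return int(data["nanosecond heartbeat"])


def check_heartbeat(base_url: str = "http://localhost:8000", timeout: float = 5.0) -> int:
    """Call the running Chroma server and return its heartbeat value."""
    with urllib.request.urlopen(heartbeat_url(base_url), timeout=timeout) as resp:
        return parse_heartbeat(resp.read().decode("utf-8"))


if __name__ == "__main__":
    try:
        print("Chroma is up, heartbeat:", check_heartbeat())
    except (OSError, KeyError, ValueError) as exc:
        # Connection refused, HTTP errors, or an unexpected body all land here.
        print("Chroma does not look healthy:", exc)
```

If the script reports a failure, the container logs mentioned above are the next place to look.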
Running locally like this is great for testing Chroma. It is also persistent, so if you shut down this Docker container, or you add millions of vectors and then pause the container or shut it down, as long as you don't delete it from Docker, all of your vectors will still be present when you turn it back on. It's awesome. We love persistence, because what's the point of writing a one-time-use vector database?

Now on to the second part of this video, where I'm going to show you how Chroma recommends that you deploy a private cloud-based instance on AWS. Why would you want to do this? For one thing, if you are running an application or a server on Vercel, or on Render, or even on your own AWS instance, or maybe you're running a local project but you want other people to be able to connect to your vector database, then the vector database clearly cannot be stored on your local machine. Technically you could use ngrok, but we're not going to get into that. Basically, you want your database to be in the cloud so that other people can reach it. That's how most databases work.

So let's go over the current state of affairs. Today is June 11th. Chroma will eventually have support for a hosted solution where you can do a one-click deployment of Chroma, it'll be production ready, all of that stuff. But you want production today, so this is how we're going to do it. Right now things are in alpha, so expect some things to break and expect things to be a little wonky, but things should work, and I've tested this and it does work.

There's really only one thing you're going to need, and that is an AWS account. Another piece of information to know is that the AWS account must have billing information associated with it, because unfortunately you cannot run this service on the t3.micro AWS instance, which qualifies for the free tier. This is a little bit chunkier because there's kind of a lot going on. You need at least two gigabytes of RAM, and you also need some solid state disk storage, because obviously, where is the data you're sending getting stored? The vectors, I mean.

There's actually a really easy way of doing this, and they tell you about this simple AWS deployment. There is something absolutely critical that you need to know: this stack has no authentication. If you publish this link in a Discord, or send it in a chat, or put it in a Reddit thread, everyone will know where your Chroma instance is. They can connect to it and have their clients send the reset command and reset your entire database. That is obviously not something you want. Part two of this video series on deploying Chroma will show you how to launch a private Chroma instance, but I want to show you what is easy so that you can get started with Cloud Chroma, which is what I guess I'm going to call it.

By the way, I don't work for the Chroma team, so if the Chroma team sees this and thinks I am wrong about any of this, let me know. But from me to you as the viewer, I want to help you. I want to get Chroma in your hands so that you can build awesome chat applications or other LLM applications. I'm excited about it, and that's why this video is being made.

Tangent aside, we know that the instance we're about to deploy will have no authentication, and I'll even show you how that's possible. We'll get into it a little bit; trust me, it is worth the wait.

The first thing you need is obviously the AWS account. They recommend deploying it through the AWS CLI. I hate the lack of visibility that the AWS CLI has, so we're going to do it through the console itself, because the UI is always a little bit more clear about what the hell is going on.

You need a thing called a template. They publish this template right here in this command. In fact, I believe I have a copy of it, let's go to my downloads, I think I have it right here. We're going to open it in Visual Studio Code just to show you that this isn't anything suspicious. This creates a series of resources that allow you to run Chroma on an instance on AWS. You'll see that they basically go through and say, okay, we're going to provision it in this region, and at the end of this we're going to get the Chroma instance and our public IP address, because that's how you're going to connect to Chroma. If we go into the security group here, you'll see that SSH is available and that we open up port 8000 for traffic, or else we're not going to be able to talk to our own instance. If we jump in here, this is where we create the t3.small instance, which is specified right here, and we're on version 0.3.26. We send all of these commands. What do these commands do? They install Docker and give it the right permissions, they start Docker, and then we pull in the latest Chroma image and give it all of these Docker file settings. Close the file, create this backup disk XML file, which is apparently required, and then docker compose, spin it up, and Bob's your uncle, now you've got Chroma running. There's nothing suspicious going on in this file, that is really what I want to drive home here.

So let's deploy it. Keep in mind they estimate that this entire thing, if you left it running for an entire month, would cost about $15 a month. If you pause the instance you don't have to pay for it, so when you're not using it, pause it. It's just a good way to save money.

I have an AWS account, and this AWS account is clean. It has nothing in it, so there's nothing conflicting and no assumed information here. You want to go to CloudFormation, and you also want to make sure you are in the region you want to be in. I'm in California, so I'm going to choose Northern California, the closest data center to me. We're going to go to Stacks and create a stack. We have a ready-made template, and it is in fact an S3 URL, so you'll see that we can pull it in there, and we're going to click Next. Then we're going to enter a name. Let's call it public chroma stack.

You'll see that some of the parameters that are defined are 0.3.26 for the version. If, for example, Chroma comes out with a new version, you may want to update this template to say 0.27 or 0.3, or whatever it might be. The instance size is t3.small. If you want to try and run it on a smaller instance, I do not think it will work at all; just use the t3.small. We are not going to pair an SSH key to this instance, because I just don't care about SSHing into it. If you have a key pair, enter the name of the private key that you made through your AWS account. If you don't have that, or you don't know what the hell that is, just click Next. Oh, sorry, the stack name needs to have dashes, hyphens; spaces are not allowed.

You can tag it if you like. If you want to give it a permission, you are allowed to do so. We want to have it so that when we deploy this, if something goes wrong, it undoes everything it just did: trash it all, nuke the entire operation. Other than that, there are other policies that you might want to add. If you know what you're doing, great. If you don't, don't touch it. That's pretty much the greatest rule of thumb for AWS.

It gives us a little overview right here. We're going to use this template, and the template says it's going to deploy a stack; it's what the JSON says. We've got some variables here: tags, permissions, stack failure policy, notifications, whatever. Also, this is cool, this little quick-create link. If somebody else is just like, "hey, I just want to click this button and get a stack," this is the way to do that. You can use the AWS CLI if you want to.

Now, this will actually run fairly quickly. It brings us automatically to the events page, and you'll see that we are creating a public Chroma stack and the create is in progress. This takes pretty much five to ten minutes. It's actually really not that intense on resources, so we'll wait until that's done.

Okay, so we have landed. You see that create complete is here. I'm going to refresh this and you'll see all the events that popped up. It created a Chroma security group that opened up ports 22 and 8000, and now we have our public Chroma stack. We go back to Stacks: create complete.

After everything has been created, though, you're probably wondering, what's the IP? How do I reach my instance? I have no idea where it is. Select the Outputs tab and you'll see that the server IP is 13.52.212.170. I don't care if you know this IP, because I'm turning it off after this video. In fact, we'll delete it in this video.

Now, if we go to this IP, nothing will happen. Why does nothing happen? Because we're in a browser, we're trying to hit port 80, and port 80, as you may have noticed from the security group, is closed. So what we actually have to do is specify port 8000, and you would do this in your client as well. Then we're going to do /api/v1/heartbeat, just like we did with our local instance, and now we have our nanosecond heartbeat again. If you did not get this JSON response, something is broken and you may need to SSH into the instance itself.

Now everything is running. If you took this URL and put it into your Chroma client, you are fully online with a remote hosted Chroma instance. Awesome. However, if you went to this URL right now, if somehow you were watching this video and you knew that this URL was alive and accurate, you could create a Chroma client, use this exact URL, and just reset my entire database, and that's not cool. The documentation for Chroma even says that if you put it behind an API Gateway, you might still have trouble authenticating this resource. Let me explain. This IP is fully available, right? It will always be available, because port 8000 is open. Even if I deployed an API Gateway that proxied all of my requests and forwarded them to my instance, if somebody still knew the instance IP, they could reach it and they could reset it, because of just how this whole thing works.

So part two of this video will show you how to set up a totally private Chroma instance, so that you can have an instance that is out there and secure and will only approve authenticated requests with an API token. However, if you want to put an API Gateway in front of your Chroma instance and just hope to God that nobody ever finds out your public IP, there is a way to do that, it is quite easy, and I'll show you how to do it right now.

We know what our IP was; we will grab it later. The first thing we need to do is create an API Gateway. The gateway should be in the same zone as your instance. By the way, just to show you that the instance is running, we go to our instances and we see that we have a t3.small, and for sanity's sake we'll name it public chroma so that we know this instance is for our public Chroma.

Now we'll go to API Gateway, and what we're going to want is, I believe, a REST API. Yes, okay. We're going to want a new API, and we're going to call this public chroma API and see if we can create that. And we are here. What we want to do next is Create Resource. This will be a proxy, and then we can just go and create our resource. This is going to be an HTTP proxy, where we can paste in the IP of our instance that is running public Chroma. Don't forget port 8000. Then anything that comes after the slash, just pass it through, it doesn't matter. You can use the default timeout if you'd like. We'll click Save.

Now you'll notice that we are sending a request, so let's test it to make sure. We'll do a GET on API v1. By the way, API v1 and /api/v1/heartbeat are the same thing. So let's test it, and you'll see we get the nanosecond heartbeat right away. That's awesome. But all we have right now is an API that just anyone can hit.

To show you how that works, first we're going to deploy the API. We'll need a stage, so we'll create a new stage and just call it dev, because that's what we're doing. Now we have a URL where nobody knows what the actual server IP is. This is what they call security through obscurity. You're actually not safe here, and I'll show you how. If we go to this URL and then go to /api/v1, we get our nanosecond heartbeat. However, if we go to our server IP over HTTP, we can still reach it. So either going through the API or just hitting the instance directly, we get the same result. This is not a secure deployment. That being said, if you're comfortable with this, by all means, it is very inexpensive to run, and it is by far the easiest way to set this stuff up. The totally private instance will definitely run you a bit more per month, not an insane amount, but it is also much more in-depth and complicated.

Lastly, we want to have an API key. Let's just say you want a bit more security, even though your public IP is still visible. Let me show you how to add an API key to your dev API endpoint. The way to do that is, first, we need an API key, so let's create one. We want to have a general one, so we'll just say this is General, and we're going to auto-generate it. Click Save, and we now have it. This key, by the way, I'll be deleting, so you can see it anyway.

This key right here is not being used right now. We need to add it to a usage plan, but we don't even have a usage plan, so now we have to create a usage plan. Nothing on AWS is ever really that easy; I wish it was. So we're going to create it and call it basic. You can implement throttling here. For example, let's say you want to share a database across your organization using this method. You might want to give each organization its own API key, because some people should not be hitting it as often as others, for example read-only versus write-only versus read-write. We're going to disable throttling and disable quota, which is not a safe way of doing things, but it is easier. We're going to add an API stage for public chroma API, dev; that is the one that we have deployed. Keep in mind you have to have deployed at least once to be able to do this.

Then we're going to add an API key to this, which I think was called General. There's an easier way to do this, actually. If we go to usage plans, did it not create my usage plan? Oh no, it did, okay. If we go to API Keys, we can add an API key, and I believe it was just called General, right? Why won't it show? There it is, okay. That was weird, it wasn't a dropdown. Excellent AWS UI here.

Okay, so now we have the General key associated with our stuff. However, we are still not using the API key on our API Gateway. To do that, we have to go to APIs, go to the API we just created, and click on the proxy method. We want to use this API key for all methods. If, for whatever reason, you want people to be able to read all your data but not write to it, which is actually a very common use case, you would just extract the GET method or the DELETE method or whatever. This is not an AWS tutorial, so we're going to go to Method Request and say that we do require an API key. The usage plan and API key that we just created are already associated with this gateway, so we're good.

Now that we have made changes, we need to redeploy the API. Before we even do that, and this data is saved, by the way, let's go grab our API key, because I want to show you that right now it doesn't matter. For something like this we're going to need something like Postman, because you won't be able to add an API key in your browser request. That's why we have to do this.

So first, let's go to Resources, sorry, let's go to Stages, that's where we need to go. Go to dev, and here is our invoke URL. We're going to copy that, slap it into Postman, and do a GET request on API v1. We have no key or authorization attached. We send it: nanosecond heartbeat. What we want is for these requests to fail unless a key is provided. Because we already made those modifications on the proxy endpoint, API key required, we will now deploy the API to the dev stage so that the URL expects and requires authentication for all methods. This process, by the way, takes a couple of minutes, so it may fail instantly, it may let you... there we go. Okay, so it worked for one request, and then on the next one I sent, I'm now forbidden.

How do you use an API key with your Chroma instance? It's a very good question. At this time I'm actually unsure if the library supports adding authorization keys. But if you needed to manually connect to your Chroma instance for whatever reason, you would go to API Keys, copy your API key, and add a header called x-api-key. I think this is actually case sensitive, but I'm not exactly sure, so don't quote me on that. And now we are able to get our response again. If we were to go to /collections, we should get an empty array.

So that's it. We now have an API Gateway in front of our Chroma instance running on AWS, with an authentication key associated with it. You can run it publicly if you want. I highly do not recommend that, and there is kind of a glaring security hole here, where people can connect directly to our server instance and bypass all of the security that we just set up through the API Gateway. We'll get into that in the second video.

Now, of course, the last thing we need to do is tear everything down. How do we delete all of the stuff we just made? The API Gateway, which we set up outside of the pre-built CloudFormation stack that Chroma gave us, you actually have to go and delete yourself, because it will not get deleted when you remove the stack. Now if we go to CloudFormation, you will see we have our public Chroma stack. Just click Delete. That removes all resources that were generated: that's the instance, that's the security group, and that's pretty much it. If you refresh this, it says delete in progress, and within a couple of moments it'll be gone. This instance will also be gone. It's already gone, because it is no longer running.

So yeah, that is it. That's how you create, tear down, and even put an API Gateway in front of a Chroma instance, all using the AWS console, because it's just a simpler way of doing things than using the CLI. It would be great to fork a version of the CloudFormation template to add this in by default, but those templates get very tricky very quickly, and I frankly just don't have any interest in doing that. Hopefully this video helps you. AWS being AWS, in six months this video will probably be irrelevant, because they'll have completely changed their interface, but before then it'll be useful. So that's it. The next video will be how to properly secure and gate your Chroma instance so that nobody can access it except for you, through the API Gateway. That's next, that's more powerful. Stay tuned. Thank you.
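The Postman step described above, a GET to the gateway's invoke URL with an x-api-key header, can be sketched in plain Python as well. This is an illustration under stated assumptions rather than anything shown in the video: the environment variable names CHROMA_GATEWAY_URL and CHROMA_GATEWAY_KEY and the function names are hypothetical, and the invoke URL is whatever your own dev stage shows. HTTP header names are case-insensitive, so only the key value itself has to match exactly.

```python
import json
import os
import urllib.request


def build_heartbeat_request(invoke_url: str, api_key: str) -> urllib.request.Request:
    """Prepare a GET to <invoke_url>/api/v1/heartbeat carrying the API key header."""
    url = invoke_url.rstrip("/") + "/api/v1/heartbeat"
    return urllib.request.Request(url, headers={"x-api-key": api_key}, method="GET")


def send(request: urllib.request.Request, timeout: float = 10.0) -> dict:
    """Send the prepared request and decode the JSON body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical variable names; set them to your stage's invoke URL and key.
    invoke_url = os.environ.get("CHROMA_GATEWAY_URL")
    api_key = os.environ.get("CHROMA_GATEWAY_KEY")
    if not invoke_url or not api_key:
        print("Set CHROMA_GATEWAY_URL and CHROMA_GATEWAY_KEY to try this.")
    else:
        try:
            print(send(build_heartbeat_request(invoke_url, api_key)))
        except (OSError, ValueError) as exc:
            # A missing or wrong key surfaces here as an HTTP 403 error.
            print("Request failed:", exc)
```

Keep in mind that this only exercises the gateway path. As noted in the video, the instance itself is still reachable directly on port 8000.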
Info
Channel: Tim Carambat
Views: 8,230
Id: xRIEKjOosaM
Length: 29min 9sec (1749 seconds)
Published: Sun Jun 11 2023