Python AI Agent Tutorial - Build a Coding Assistant w/ RAG & LangChain

Captions
In this video, I'll be showing you how to build your own custom AI agent using LangChain and retrieval-augmented generation. We'll build this out using Python, and even if you're just an intermediate Python user you can still follow along; you don't need to be a complete expert. What we'll do here is query information about one of our GitHub repositories and then make an AI agent that acts like a coding or GitHub assistant. That means it can summarize different issues, it can respond to them if we want it to, and it can access the different tools that we give it.

So I'm on the computer now, and I'll give you a quick demo of how this works. The first thing the agent is going to ask us is whether we want to update the issues from GitHub. That's because we're going to go to GitHub and grab all of the issues from whatever repository we select, but we don't always need fresh issues; maybe we only want to do that update every day or every few days. In this case I've already updated the issues recently, so I'm just going to press no. Then it's going to let me ask questions about the GitHub repository or use the different tools that we've given it access to. It says "Ask a question about GitHub issues", so I'm going to say something like, "Hey, can you tell me what issues people are having related to flashing messages?"

What it's going to do now is show us the thought process of the agent. You can see that it queries our tool, which is the GitHub vector search database (we'll talk about that in a second), for "flashing messages". It then retrieves all of the issues that fit that criteria and gives us a summary of them. So it says, "Here are some of the issues related to flashing messages that people are facing", and then it spits out the different issues. Now what I can do is tell it to make a note of this, and it will utilize a different tool
that it has access to in order to save this information onto our local computer. You'll see what I mean, but the agent has various tools, and based on what we ask it to do it can reach out and utilize them. It's really cool, and you can expand this and make it really awesome. So I'm going to ask the same thing now, but I'm going to tell it to save a note. I've just said, "Can you summarize the GitHub issues related to flashing messages and then save them as a note?" Let's type this in. You can see the first thing it does is use the GitHub search tool with the query "flashing messages", and then you can see it's invoking the note tool, and it saves a summary of the issues. If we go here, this is the note that it just saved. It says "Flash messages not showing in the login page" and "Alert message in homepage", and then it gives us a description of the issues that people are facing.

This is just a really small demo showing you what's possible in terms of building an AI agent; obviously you can build this out and make it that much cooler. As a final thing here, I'll show you that this is the GitHub repository it's working on, and if I go to Issues, you can see that one of the first issues we have here is related to flashing messages. So that's exactly what it found, summarized for us, and stored as a note. We could expand this and make it automatically reply to issues; we could even make it write a pull request to solve certain issues that people are having. With that said, let's get into the tutorial and let me show you how to build this out.

So we're going to start building this project out now. What I've done is opened up VS Code. I've opened a new folder, and this is where we'll write all of our code. The first step is to install a lot of different packages and dependencies, and then we're going to retrieve the issues we need from a GitHub repository. This
can be your own repository or a public repository; I'll show you how we connect that and set that up. Then we're going to take those issues and store them in something known as a vector store database. We're then going to build out the AI agent, which will have access to the database and can query and look up specific issues really fast. That's basically the process, so we need to start with GitHub and getting everything set up. Let's do that right now.

What I've done in VS Code here is opened up my terminal, and I'm going to create a virtual environment to install the various packages that we need. To do that, we're going to use the command python3 -m venv followed by the name of our virtual environment; in this case I'm just going to call it github. If you're on Windows, you can change the command to simply say python. This is going to create a new virtual environment for us, if we spell python correctly, and then we can activate it. If you're on Mac, the command is source github/bin/activate, and then you'll see that we get this github prefix in our terminal. If you're on Windows the command will be slightly different, and I'm going to leave it on screen. You may also just have to look it up, because the exact command differs depending on whether you're using Command Prompt, PowerShell, or another shell.

Regardless, once the virtual environment is activated, we can install the different packages. If you want a bit of a cheat, you can go to the link in the description, where I have all of the finished code for this project. You can copy the requirements.txt file, put it inside of this directory, and then you can
install all of the requirements from there, or you can follow along with what I'm about to show you, where we just install the packages manually. So what we're going to do now is type pip install, and then we're going to write the different packages that we need. The first package we need is python-dotenv (let's make sure we spell that correctly), for loading in environment variables. We then need the requests module, so that we can send requests to the GitHub API to get the issues. We then need the langchain package; this is how we're going to write our agent. We then need langchain-astradb, because we're going to use Astra DB as our vector store provider; again, we'll talk about how that works in a second. We then need langchain-openai, and then langchainhub. Let me just double check that those are correct. Looks like those are good to go. So again: python-dotenv, requests, langchain, langchain-astradb, langchain-openai, and langchainhub. We're going to install all of those in our virtual environment, and then we should be good to go.

All right, so our virtual environment is set up, and we can start writing some code. But what I'd like to do first is get all of the tokens and credentials that we're going to need for this project and set them up, so we can do all of the coding at once and don't have to keep going back and forth between our browser and different websites. What we'll do in VS Code is create a .env file. This is an environment variable file where we're going to store all of the sensitive credentials that we need, because we're going to be connecting with various APIs, with the database, and with OpenAI. So bear with me here; we're going to write out a few different keys that we'll fill in over the next few minutes. So we're going to
type GITHUB_TOKEN, and this is going to be equal to an empty string. We're then going to say ASTRA_DB_API_ENDPOINT is equal, again, to an empty string. We're then going to say ASTRA_DB_APPLICATION_TOKEN is equal to an empty string, ASTRA_DB_KEYSPACE is equal to an empty string, and then OPENAI_API_KEY is equal to an empty string. We're going to fill all of these in with our own credentials. As for Astra DB, this is the vector store database provider. It comes from DataStax, which is the sponsor of this video. Don't worry, it's free to use, and I'm going to show you how to set it up in just a second.

The first thing we're going to do is get our GitHub token, which is going to allow us to access the GitHub API. So I'm going to go to my browser and go to GitHub, which conveniently I already have open. I'm just going to open this in a new tab so we can save that page. What we're going to do here is click on our little icon and go to Settings. From Settings, we're going to scroll down to Developer settings on the bottom left-hand side, go to Personal access tokens, and then Tokens (classic). From here we're going to create a new token. I've already created one, so I'm not going to make a new one here, but you can simply press this button, make sure you create a classic token, and then copy the token into the location that I showed you. All right, so you should have copied that token, and you just want to paste it right here where it says GITHUB_TOKEN. Don't worry, I'm going to revoke this token after the video so you guys cannot use it.

The next step is to get all of our information for our Astra DB database. So what we're going to do now is go to this site right here, which is the DataStax website. Again, thank you to them for sponsoring this video and providing this database for free
for all of us to use. You can simply click the link in the description and you'll be able to view it. Now let me quickly explain what this is and why we need it for this video. We're going to be building a RAG application, retrieval-augmented generation, and what that means is that our AI agent is going to have access to, in this case, a database, something called a vector store database, where it can really quickly look up and query information. DataStax is going to be providing that database through their product known as Astra DB. This is a very fast vector database that allows us to quickly look up information based on similarity. So rather than dumping the entire contents of our GitHub issues onto our agent, which it might not even be able to handle because it could be so much information, the agent is going to utilize this vector database. It's going to query for a certain piece of information, the database is going to look up, based on similarity, all of the pieces of information that are relevant to the query, and then it returns just that data to our model so the model can use it and give us some results based on it.

So this is the basic setup for RAG. Usually we're using a vector store database, and the way this works is that we have vectorized information that allows us to search based on context and similarity, and for this kind of lookup that's much faster and more efficient than using a traditional SQL database. I don't want to explain it too much; you can look up vector store databases if you want to learn more. But Astra DB has a free one that we can mess with for this video, and obviously they have paid plans as well if you want to scale this up to production. So what we're going to do is click on Try for Free here, and if you don't already have an account, make one. In my case I'm just going to sign in. We're going to make a new database, and I'm going to show you how we can
configure this. All right, so once you've created your account or signed in, you'll be brought to a page that looks like this, where we're going to click on Databases and then create a new one. You can see I already have one that I was messing with when preparing this video. What we can do is click on Create a Database. We're going to go with the serverless vector database, which is the one that I was talking about, and then we're going to give it a name. In this case I'm just going to call it GitHub agent. Then we need to choose a provider and a region, so I'm just going to leave it as Amazon Web Services and go with us-east-2. Obviously if you upgrade to the premium version you'll get some more options here, but for our case this is completely fine, and we'll be able to do everything we need without paying for this service. Okay, so I've just done that here. You can see that it's initializing the database. This will take a few minutes; once it's done I'll be right back and I'll show you how we retrieve the information that we need to connect to it from our code.

All right, so the database has been created now, and what we're looking for is our API endpoint and our application token. So I'm going to copy the API endpoint here, go back to our code, and paste that in. Then we're going to do the same thing for our token. We're first going to press Generate Token, it's going to give us one, and let's copy that (obviously you don't want to leak it like I am) and then paste it here. As for the keyspace, this is something that you don't actually need to fill in. I'm just putting it here in case you do want to enter it; you don't really have to concern yourself with what that is.

Okay, next we need our OpenAI API key so that we can connect to OpenAI and use ChatGPT. To do that, we're going to go to platform.openai.com/api-keys. I can leave this link in the description; you can also probably find it pretty easily just by going to OpenAI. We're going to make a new key. Now, I believe at this point you do need to add a credit card to OpenAI in order to use this. It should only cost you a few cents, if anything; it might even be free. But I just want to make you aware that, as of the last time I did this, I believe you need some kind of payment method on file to use this API. There are other ways to run this project locally, using something like Ollama, but we're not going to be walking through that in this video. So I'm going to click on Create new key, and I'm just going to call this my GitHub agent. It's going to give us a new secret key here; we're going to copy this and then, same thing, go back to our code and paste it in for our OPENAI_API_KEY. That is always a mouthful.

Okay, so now we have all of our tokens, and we can stay here in VS Code and write the complete project. Let's close our terminal and make a new file here called github.py, where we'll retrieve the GitHub issues. Inside of here we're going to start with our imports. We're going to say import os, import requests, from dotenv import load_dotenv, and then from langchain_core.documents import Document. We're going to use that in a second; that's how we're going to wrap our GitHub issues and then pass them to our vector store database. The first thing we're going to do is call this load_dotenv function. What that will do is search for the presence of a .env file and then load all of these environment variables for us, so that they're in the Python program and we're able to use them. The next thing we're going to do is grab our GitHub token, so we're going to say github_token is equal to os.getenv, passing the name of that variable, which is GITHUB_TOKEN, which will now have been loaded by this function.

Okay, next we're going to write a function that will use the GitHub API. You don't need to set anything up here; you just need this token for it to work. So we're going to say def fetch_github, and we're going to have the owner, the repo, and the endpoint of the information that we want to fetch. Now we're going to craft a URL. We're going to make an f-string in Python, and this is going to be https://api.github.com/repos/ and then, inside of curly braces, the owner, the repo, and the endpoint, separated by slashes. So you can see that we're making this dynamic (let me just make this a bit smaller so we can see it) so that we can load any kind of repo that we want; we just need to pass in these different parameters. That's how we're crafting the URL. The endpoint would be something like the issues, a readme file, or pull requests. We can grab different pieces of information; in this case we're just doing the issues, but you could really easily adapt this code to grab all different kinds of info from your GitHub repository and then pass that to the agent.

Okay, next we're going to set up our headers. So we're going to say headers is equal to a dictionary, and we're going to create the Authorization header, which is what we'll need in order to send an authorized request to GitHub. We're going to say "Authorization", and then the value is going to be an f-string: Bearer, then a space (it's very important that you have the space), and then the GitHub token in curly braces. This is how we pass our authentication token so that we're able to send this request and get back a valid response. Next we're going to craft the request. So we're going to say response is equal to requests.get; we're going to send this to the URL, and we're going
to say that headers is equal to the headers that we've just set up. This will send a GET request to the URL and give us some information back, so now we're going to check the status of the response. We're going to say if response.status_code == 200, then data is equal to response.json(). That's how we load the JSON we got back into Python data. Otherwise, we're going to print "Failed with status code:" followed by response.status_code, and then we can just return an empty list. Otherwise we'll come down here and return our data. Okay, perfect.

So now that we have our fetch_github function, we're going to write a function that will call it, get all of the results, and wrap them as LangChain documents, which we're going to need for the next step, as you'll see in a minute. But for now, just to test this out, we can print the data that we got before we return it, so that we can see what it looks like. So let's call this function. To call it, we're going to say fetch_github, and what we need to pass to it is an owner, a repo, and an endpoint. What are those going to be? Well, the owner is going to be techwithtim, because that's my GitHub account; obviously you can change this to your own. For the repo, I need to find the repository name so that I don't mess it up. Okay, so the repo is going to be Flask-Web-App-Tutorial. I believe you guys should be able to put this in as well, because it's a public repository. And then the endpoint is going to be equal to issues. So now I'm going to pass this owner, repo, and endpoint, and we can run our code to test whether this is working. So let's clear and run it with python3. The output is pretty difficult to read, but the main important thing to understand is that this
is inside of a list, and what this is doing is giving me all of the issues, with a bunch of different parameters and properties about them. You can look through this response and parse it however you want, but just trust me in the next step here, because I've already looked through it and found the important keys that we want to extract. So I'm going to close this (let me just make this a bit smaller so we can see it), and we're going to write a function that will take these results and wrap them as LangChain documents. So we're going to say def load_issues. This is simply going to take in the issues and then, as I said, parse them and load them as documents that we can use in our retrieval-augmented generation program. We're going to say docs is equal to an empty list. (We can make this bigger again so we can see it.) We're then going to say for entry in issues; remember that issues is a list of all of the different issues, each with a bunch of different keys.

Now we're going to create some metadata. We're going to say metadata is equal to a Python dictionary. The first piece of metadata we want is the author, and to get the author from our issue, this is going to be entry["user"]["login"]. Again, this might seem like gibberish, but I've gone through the issues, and these are the keys in that JSON, or rather in that Python dictionary, that we need to access. Next we're going to have the comments; we probably want that as metadata as part of the issue, so we're going to say entry["comments"], which in the GitHub API response is the number of comments on the issue. We're then going to have the body of the issue, which is the main text description, so that's going
to be entry["body"]. We'll then have the labels, so maybe they've labeled this and that can help us get some context about the issue; that'll be entry["labels"]. And then lastly here, we're just going to get the created-at date, so we're going to say created_at, and this is going to be entry["created_at"]. So this is the metadata that we'll store with all of our documents. There's a lot more information associated with the issues, but this is the important stuff that we're pulling out.

Next we're going to say data is equal to entry["title"]. We're going to get the title of this issue because that's the main thing that we want when we're indexing or looking up these issues. And then we're going to say if entry["body"], so if we actually have some body text or a description, because not all issues do, then data += entry["body"]. What this is doing is taking that description and concatenating it to the title, so that when we add the main content of our documents, which you're going to see in a second, we have both the title and the description. That way we have more information, so our model can perform better. So I'm going to say doc is equal to Document, and we're going to say page_content is equal to data, what we've just crafted right here, and then we're going to set metadata equal to metadata. When we search in the database for this information, we're mostly going to be looking in the page content; the metadata is additional information that's provided about the document. That's why we've taken the title and the body and combined them together for the page_content argument. Okay, now that we have this document crafted, we're going to say docs.append and append this document, and then from this function we're going to return docs.

Okay, we're almost done. We're just going to write one more function. This is going to be def fetch_github_issues, and it's going to take an owner and a repo. Inside of here we're going to say data is equal to fetch_github, passing the owner and the repo and manually specifying that we want the issues endpoint, and then we're going to return load_issues, passing in the data. So all I'm doing is writing one function that combines these two functions together: it first fetches the GitHub information that we need, then loads the issues, and then returns all of that for us.

Great. Now that we have that, we're going to write another file. This file is going to be called main.py, where we're going to import the code that we've written and use it to load the issues and store them in the vector store database. All right, so we're inside of main.py, and we're going to start with all of our imports (there are quite a few) and then we'll write the main chunk of code. Again, at this point, I know we haven't really seen it or tested it, but we're fetching all of the GitHub issues from our repository and loading them as documents that we need to store in the vector store database. Now we need to connect to the vector store database and save that information, and then we can move on to writing the agent, which will have access to this database as a tool.

So we're going to go to the top of our program and say from dotenv import load_dotenv. We're then going to import os, and then we're going to import a bunch of stuff that we need from LangChain. We're going to say from langchain_openai import ChatOpenAI and OpenAIEmbeddings. We're then going to say from langchain_astradb import AstraDBVectorStore. By the way, don't worry if I'm going too fast for you; you can always copy all of this code from the link in the description, or feel free to pause the video or slow down the playback speed. We're then going to say from langchain.agents import create_tool_calling_agent (that's the name of it). We're then going to say from langchain.agents import AgentExecutor, from langchain.tools.retriever import create_retriever_tool, and from langchain import hub. And then we're going to say from github, which is the file that we wrote, import fetch_github_issues.

Okay, now we're going to call the load_dotenv function so we load in the environment variables inside of this file as well, and now we're going to connect to our vector store database. So what we'll do here is write a function called connect_to_vstore, vstore standing for our vector store. Inside of here, what we need to do is create something known as a set of embeddings (I'll talk about that in a second) and then connect to the Astra DB database provided by DataStax. To do that, we'll load in the different keys that we have in our environment variable file, create a new instance of the vector store, connect to it, and then I'll show you how we use it.

Okay, so let's create some embeddings. We're going to say embeddings is equal to OpenAIEmbeddings(). An embedding is a way of turning a piece of data, in this case textual data, into a vector. A vector is something that exists in multi-dimensional space, and this is what we're actually storing inside of the vector store database. So these embeddings will be used by the vector store database to take our textual data, our documents, and convert them into vectors, which we then store inside the database. That's why we first need these embeddings. These are provided by OpenAI, and the reason this call is going to work is that we've specified our OpenAI API key inside the environment variable file. If we didn't specify that, this code wouldn't work, because in the background it's looking for that special environment variable. So we're using OpenAI here to provide these embeddings; it's kind of like a mini machine learning model that converts the data into vectors, which we can then store inside of the database.

Okay, so the next thing we need to do is load in our Astra DB credentials. We're going to say ASTRA_DB_API_ENDPOINT is equal to os.getenv, and then we can simply take this name and paste it inside as a string. We then need our ASTRA_DB_APPLICATION_TOKEN, which is equal to os.getenv, and same thing, we can take this and paste it inside as a string. And then we're going to have desired_namespace equal to os.getenv, and this is going to be ASTRA_DB_KEYSPACE. This is going to be blank; we don't actually need to define it. There are some more advanced configurations where you can, but we don't need to talk about that right now. So I'm going to say if desired_namespace, then ASTRA_DB_KEYSPACE is equal to desired_namespace; otherwise ASTRA_DB_KEYSPACE is equal to None. The reason for that is when we load this in, it's going to give us an empty string, and an empty string is not a valid option. So what we need to do is check: is this an empty string? If it isn't, we'll set the keyspace to that value; otherwise we'll set it equal to None so that Astra DB knows not to use this option.

Okay, now we're going to say vstore is equal to AstraDBVectorStore, and inside of here we're going to pass our different options. We're going to pass embedding equal to embeddings. We're going to pass the collection name, which is kind of like the name of the database or the table (this is a NoSQL database, so it's not really a table; it's called a collection), where we're going to be storing our data. In this case we're going to store it in something called github, but you can change the name if you want. We're then going to set api_endpoint equal to ASTRA_DB_API_ENDPOINT, token equal to ASTRA_DB_APPLICATION_TOKEN, and then namespace equal to ASTRA_DB_KEYSPACE.
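The connection steps described so far can be sketched roughly as follows. This is a minimal sketch, assuming the environment variable names from the .env file and the langchain-astradb and langchain-openai packages installed earlier. The helper normalize_keyspace is a hypothetical name introduced here just to isolate the empty-string check, and the namespace argument is the one named in the video; newer releases of the package may spell it differently.

```python
import os


def normalize_keyspace(value):
    """Return None for a missing or empty keyspace so the default is used."""
    # Hypothetical helper (not from the video): "" and None both become None.
    return value if value else None


def connect_to_vstore():
    """Sketch of the connection step; needs the packages and .env values above."""
    # Imported here so the helper above can be tried without these packages.
    from langchain_astradb import AstraDBVectorStore
    from langchain_openai import OpenAIEmbeddings

    embeddings = OpenAIEmbeddings()  # picks up OPENAI_API_KEY from the environment
    return AstraDBVectorStore(
        embedding=embeddings,
        collection_name="github",
        api_endpoint=os.getenv("ASTRA_DB_API_ENDPOINT"),
        token=os.getenv("ASTRA_DB_APPLICATION_TOKEN"),
        namespace=normalize_keyspace(os.getenv("ASTRA_DB_KEYSPACE")),
    )
```

Calling connect_to_vstore() only works once load_dotenv() has run and the credentials are filled in; the keyspace helper can be checked on its own without any account.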
Then lastly, we're going to return vstore, and that is our function to connect to the vector store. What this will do is look to see whether this collection exists, and if it doesn't already exist, it's going to automatically create it for us. So just something to keep in mind there: if the collection named github doesn't exist, it will just make it for us when it connects. Okay, so now we're going to write a line that says vstore is equal to connect_to_vstore().

The next thing we're going to do is ask the user if they want to update the issues. The reason is that we don't always need to update the issues in the vector store; it's only maybe every day or every few days that we're getting new issues on our GitHub repository, so it's inefficient to do this constantly. We'll just manually tell the program when we want to update, and only then will we hit the GitHub API and do the operations with our database. So we're just going to have a line here: add_to_vectorstore is equal to an input statement, and inside of it we're going to say "Do you want to update the issues?" and then put our options, (y/N). The reason I'm using a capital N is that it's the default option. I'm then going to convert the input to lowercase and check whether it's in the list of "yes" or "y". So all I'm doing is getting the input and saying: if what they typed in, lowercased, is "yes" or "y", then yes, we do want to add our issues to the vector store, and we'll go ahead and do that. So if this is the case, if we are adding to the vector store, then we're going to call the function that we wrote earlier, which is fetch_github_issues.
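The yes/no check described above can be sketched like this. It is only an illustration: the function names wants_update and ask_to_update are hypothetical, and the strip() call is an addition not mentioned in the video, included so that stray spaces around the answer don't matter.

```python
def wants_update(answer):
    """Return True only for 'y' or 'yes' in any letter case."""
    # Anything else, including just pressing Enter, counts as the default "no".
    return answer.strip().lower() in ("y", "yes")


def ask_to_update():
    """Prompt once; the capital N in the prompt signals that "no" is the default."""
    return wants_update(input("Do you want to update the issues? (y/N): "))
```

Keeping the parsing in its own function means the default-to-no behavior can be checked without typing anything at a prompt.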
So I'm just going to copy all of the code that we had here, because we no longer want that inside of this file, and we're going to paste that inside of here. So we have our owner and our repo and our endpoint, and we don't actually need the endpoint here because we're going to be using a different function. This function is fetch_github_issues, where we don't need to pass the endpoint; we just pass the owner and the repo. Now what this returns is a bunch of different documents. Again, these are the documents generated by LangChain, which we can then use to add to our vector store. Okay, so now we need to add them to the vector store. To do that, we're first just going to try the following: we're going to say vstore.delete_collection(), and we're going to say except and then pass. Now the reason I'm doing this is that if we are going to update the vector store, we first want to clear any issues that are currently in there, in case, for example, an issue was deleted or resolved or removed since the last time that we updated it. So we're just going to clear the entire collection, so remove everything that we currently have, and then add all of the new issues that we've grabbed into there. Okay, so we're going to go here and say vstore is equal to connect_to_vstore(), and then we're going to say vstore.add_documents, and we're just going to go here and add our different issues. And actually, sorry, issues is equal to fetch_github_issues.
Okay, let me quickly run through this, because I want to make sure I'm not confusing any of you. We have our owner, we have our repo, we go and we fetch our different GitHub issues, and we store those as issues. We then try to delete the collection in case it already existed. Now what I'm doing after this is reconnecting to the vector store, and the reason why I'm doing this again, because you notice I have it up here as well, is because if we did actually delete the collection, then that means that we need to connect to it again so that it's automatically recreated. So you can see I have my collection name, github. If I delete it and then I just try to continue to use the vector store, well, we're not going to have that github collection, so I try to connect to it again, so in case it was deleted it will automatically be recreated from right here and then returned to me so I can start using it. So it's slightly inefficient, but it's kind of the best way, sorry, to actually go about recreating the collection in case we deleted it. Okay, so that's going to add all of the documents, but now we want to actually test and see if this database is working and if it can actually look up any of the GitHub issue information. In order to do that, we can write the following line. We can say results are equal to, and this is just some test code we can delete after, vstore dot, and then we can use this similarity_search, and we can pass inside of here something like "flash messages", and we can say k equals 3, which is the number of documents that's going to be returned to us. So we could return more if we want; in this case we'll just get the three most similar documents to "flash messages". So we're doing a vector lookup here on all of the different issues. Okay, so now what we're going to do is say for res in results, because
this will be a list of various documents. We're going to print the following, and this is going to be an f-string. Okay, inside of here we're going to put an asterisk, and then we're going to say that this is res.page_content, and then we're going to put here res.metadata. Okay, so that should be good, and now if we run our code and we type "yes" or "y", it should go to GitHub, fetch those different issues, and then print out to us the various results. So let's test this out before we go any further. Let me get out of this, however I ended up doing that. Let's clear this. I'm going to type python3 main.py, and we'll see here it's going to give us some warning; we can ignore that for now. Let's give it a second here to connect and load everything up. And we got some issue: it says cannot import the name hub from langchain hub. Okay, let me see what the problem is here and then we can fix that. All right, so it's actually not the import of langchain hub, it's just from langchain import hub. My apologies, guys, on line nine there. We'll fix that up and we'll rerun our code now, and hopefully this time it will work. All right, so we have the prompt appearing here, so I'm just going to type in "y" for yes so that we can update the issues and we can test this code. Okay, so you can see that that worked extremely fast. You can see that it printed out all of the different GitHub issues. Okay, so actually we're not yet at the similarity search; this was just me forgetting to remove that print statement from the GitHub code that we had, so no problem. But it did retrieve these different issues, and it's worth noting, by the way, this only retrieves 30 issues at a time. We could build in some pagination and we could get all of the different issues, but in this case we'll just go with the 30 most recent ones, which is what it's giving us. And then the similarity search should work in a second, but it does just take a minute to actually connect to the, what do you call it, database, and then do that
search operation. Okay, so we got an issue here that says document object has no attribute met data. The reason why we got that issue, as well, is because we spelled metadata incorrectly, of course. Another silly mistake. We can quickly fix that and we can rerun our code one more time. I'll fast forward through it and we'll just make sure that this works. All right, so the code did indeed work, and you can see here, when I scroll through the results, that we get a bunch of different similarity search results. So it says "alert message and homepage", and then it spits out the whole body here: hi there, so as I'm following the video, blah blah blah, this was the issue. And then it gives us information like the author, the comments, and everything else that we specified here in the metadata. So that's pretty much it in terms of doing the similarity search. I just wanted to show you that that does work and how you can test it out manually by using this line of code right here. We don't actually need this, so I'm just going to comment it out, and now what we're going to do is move on to actually writing the agent. We're going to provide this vector store, so the one that we wrote right here, as a tool to the agent, and it can decide when it needs to use it. So that's the really cool part about agents: we just give them these different tools, in this case one of the tools is the retrieval augmented generation part, so our vector store database, and if it needs to use it, it will; if it doesn't need to use it, it won't. We're also going to build another tool, which is just a Python function that will allow us to save a note. Okay, so let's go and do that. So since this is going to be a tool, we need to wrap this as a tool. In order to do that, we're going to say retriever, and did I spell that correctly? I'm not really sure, but I think that's fine, will be the vstore.as_retriever. We're going to go here and say the search_kwargs are equal to, and this is going to be k: 3, so it just returns three documents. If we wanted to return more documents, we can set this to 5, 10, whatever we want.
Okay, then we're going to say the retriever_tool is equal to create_retriever_tool, which we imported earlier. We're going to pass in here the retriever, and then we need to give some description of what this tool is. So we're going to say github_search. So what we're doing first is providing the name for the tool, and then we're going to write a description of what the tool does and when the agent should use it. So this is kind of how the tools work: we provide what the tool actually is, in this case it's a vector store database, and LangChain will automatically allow the agent to invoke this and start querying for different documents. But the agent needs to know when to use it, and in order to do that, we have to provide it that data. So we say, okay, the name is github_search, and then the description is going to be whatever it is that we want to tell the agent that this tool is about. So I'm just going to copy in the description to save us a little bit of time. You can modify this and see if it works better for you, but this was working well for me. This says: "Search for information about GitHub issues. For any questions about GitHub issues, you must use this tool." Okay, so really straightforward: I'm just telling it what the tool is and when it should use it. And now that we have that tool, what we'll do is start setting up our chain of different prompts and return types and all the things we actually need in order to run this agent. So now that we have this tool set up, and by the way, we're going to write another one in a second, we need to have a special prompt that tells our AI, in this case it'll be ChatGPT, how to utilize these tools and how it should behave. Now rather than writing our own prompt and trying to come up
with one, we can actually just download one from LangChain Hub that's automatically configured to do everything that we want. So we're going to say prompt is equal to hub.pull, and bear with me while I write this out, this is hwchase17, then a slash, openai-functions-agent. Okay, so this is a special prompt for OpenAI, or for ChatGPT, that tells it how it should behave and how it should utilize the various tools that we give it. That's it; this is just going to download it for us and automatically inject it into our code. If you want to view how it works, you can just go to this URL, or you can just look this up on LangChain Hub, and you can see what the prompt looks like. Now that we've done that, we're going to create the LLM that will use our various tools. So we're going to say llm is equal to ChatOpenAI(). We don't need to do any other configuration, because in our environment variable we've provided the OpenAI API key. Okay, now what we're going to do is say tools are equal to, and this is going to be a list, because we can provide as many tools as we want. In this case we'll have the retriever_tool, and in a second we'll add another tool. And we're then going to make our agent. So we're going to say the agent is equal to create_tool_calling_agent, a function from LangChain. We're going to pass the llm, we're going to pass the tools, and we're going to pass the prompt that we want to provide to this agent. And by the way, you can customize this prompt if you want; I'm just not going to show that in this video. Okay, so now that we have the agent, we actually need something that allows us to execute the agent. Now that's called the agent executor. So we're going to say agent_executor is equal to AgentExecutor, which we imported above. We're going to provide to that our agent, which we just created, and our tools, which will be equal to the tools, and then we're going to say verbose equals True. Now what this means is we're going to get all of
the details on the thought process of the agent, so we can see what it's thinking and how it's selecting the tools. If you don't want to see that and you just want the output, then you can set this variable to False. Okay, so we're almost done at this point. Now what we need to do is just write a loop where we utilize this agent executor so that we can continue to ask it various questions, and then you'll really see this project come full circle. So we're going to say while, like this, and I'm going to write some fancy Python code. I'm going to say question colon equals, and then this is going to be input, and we're going to say "Ask a question about GitHub issues", okay, and we're just going to specify "q to quit". I'm going to say while all of this does not equal "q", because if you type in q, then we're going to quit. Then what I'm going to do is say my result is equal to agent_executor.invoke, and I'm going to invoke this by passing in a Python dictionary that has a key "input" and the value of question. I'm then going to print the result and the "output". There's a few other keys you can look at here, but all we really care about is the output. Let me make this a little bit smaller, just so we can quickly summarize what we've done before we run this code and then write our additional tool. So you can see that we have tools equal to retriever_tool. We create the tool calling agent; you don't really have to worry about exactly what's happening here, this is just a function from LangChain that, again, takes the LLM, takes the tools, and takes the prompt, and creates that agent for us. We then need an executor so that we can actually utilize the agent, so again, pass the agent, pass the tools, and specify if we want the detail about the thought process or not. Then we write some code here: we're saying while question colon equals. Now this is what's known as the walrus operator in Python, and it allows us to actually define or declare a variable
while we have it as part of a condition, which is what we've done here. So all we're doing is setting the variable question equal to the result of this input and then making sure it's not equal to "q", because if it is "q", we're going to quit. Then we take this question, we pass it to the agent, and we print out the result. Okay, so before we test this, let's write that last additional tool. It's really simple; it's going to be just a few lines of code. So let's make a new file here, and we're going to call this note.py, and all I'm going to show you how to do here is how you wrap a Python function, which can be literally anything you want, as a tool that you can provide to your agent. So we're going to say from langchain_core.tools we're going to import tool, which is a decorator. We're then going to say def note_tool, and we're going to pass in a parameter called note. Okay, and then we're going to write a docstring. Now the docstring here acts as the metadata, or the description, of the tool, so make sure you write a good docstring that describes what this function actually does and how it acts as a tool. So I'm going to say "Saves a note to a local file", and then you need to make sure you specify the arguments that this function takes in. In this case it takes in one argument, which is a note, and we're going to say "The text note to save". Okay, so this is equivalent to what we wrote manually for the other tool, and also the name of the tool will be the function name, so make sure you give it a good name. Then we're going to decorate this: we're going to say @tool. What this does is convert this into something that can be passed as a LangChain tool to our agent. So take in how easy that is: you write the function name, you write a little bit of description of how the function works, as well as its different arguments, so the agent knows how to call it and what it is, and then you just write the function body for what you want it to do. So in our case we're going to say with open, we're going
to open a file called notes.txt, we're going to open it in "a" mode, which stands for append, and we're going to open it as f. We're just going to say f.write, and we're going to write the note plus the backslash n, which just means we're going to move to the next line. That's it, that's our note tool, and we could write as many tools as we want like this that are Python functions that the agent has access to, which I just think is really, really cool. Okay, last thing: let's make a new file called notes.txt, just so that that file exists, so that when we try to open it in append mode we don't get an error. And then we're going to go to main.py, we're going to import this tool, we're going to add it to the list, and we're done. So let's go to main, let's go up to the top here, and we're going to say from note import note_tool. Okay, and we're going to go to our tool list. It's as easy as this: we're just going to add it into the list, so we're going to say note_tool, and now we can test out our agent. So in under an hour we've been able to build this, which I think is pretty impressive. Let's clear and let's run our code, so python3 main.py, and let's see what we get. So we don't need to actually update the issues here because we updated them recently, so I'm just going to go with no, and then we should get a prompt popping up here saying, hey, ask us some questions. Okay, so it does. So I'm going to say, hey, can you tell me about issues related to flashing messages? Okay, so it's entering this. You can see it's using github_search with the query "flashing messages", and it finds a bunch of different results here. Okay, and then it gives me a summary of them: I found some issues related to flashing messages. You can see that it found two different documents here. It says let me know if you need more information on these issues. So you can say, hey, that's great, thanks, and you'll notice in this case it's not actually going to use any of our tools, because it didn't need to, so it just said, you're welcome, if
you need any more help, let us know. That's the beauty of an agent: it knows when to use the tools. So now I can say something like, hey, can you save a note saying hello world? So now what it should do is use the note tool, which it is, and you can see, if we go now to our notes, so let's go to notes.txt, we have the note hello world. Now what we can also do is combine multiple tools together. So we can say, hey, can you summarize a few issues from GitHub related to Flask and then save the summary as a new note? Okay, so now it should use both of these tools. So the first thing it's doing is querying for Flask. Okay, it found some issue here related to Flask. That's great, that's what we're looking for. Okay, now it's invoking the note tool, and you can see that when we look here, it says this is the issue that it found, and it just saved a note related to it. Now this wasn't the best result in the world; obviously we could fine-tune this and maybe make the prompt better, but you get the idea. The agent is working, and we have now completed it. So that's it, guys, that's how you build a complete AI agent that implements RAG in under an hour, that goes first and queries GitHub, finds different issues, and can really act as a coding or AI assistant. This is super cool. Obviously we just built something really simple in this video, but you can extend this, you can make it much more complicated, and you can make it a lot more effective. Imagine if you were pulling GitHub issues and pull requests, maybe you had a code generation agent too, and you combined those together; you can make something amazing. I hope you guys were excited about this video and enjoyed it. If you did, make sure to leave a like and subscribe to the channel. Big thank you to DataStax for sponsoring this, and I will see you in the next one.
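To recap the last two pieces without needing LangChain or an API key, here is a minimal sketch of the note function and the question loop. Several things are assumptions for illustration only: the @tool decorator from langchain_core.tools is left off so the code runs on its own, the path parameter and the ask_until_quit helper are hypothetical names that do not appear in the video, and the agent is replaced by any callable that takes {"input": ...} and returns a dictionary with an "output" key.

```python
from typing import Callable


def note_tool(note: str, path: str = "notes.txt") -> None:
    """Saves a note to a local file.

    Args:
        note: The text note to save.
        path: File to append to (hypothetical parameter, added for this sketch).
    """
    # In the video this function carries the @tool decorator from
    # langchain_core.tools; it is omitted here so the sketch runs on its own.
    with open(path, "a") as f:  # "a" = append mode
        f.write(note + "\n")


def ask_until_quit(
    read_question: Callable[[], str],
    invoke: Callable[[dict], dict],
) -> list[str]:
    """Keep asking questions until "q" is entered; collect each output."""
    outputs = []
    # Walrus operator: assign the input to `question` and test it in one step.
    while (question := read_question()) != "q":
        result = invoke({"input": question})
        outputs.append(result["output"])
    return outputs
```

In main.py, read_question would be something like lambda: input("Ask a question about GitHub issues (q to quit): ") and invoke would be agent_executor.invoke. One detail worth knowing: Python's append mode creates the file when it does not exist, so creating notes.txt by hand beforehand is harmless but not strictly required.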
Info
Channel: Tech With Tim
Views: 24,564
Keywords: tech with tim, ai agent tutorial, python coding ai, python ai projects, python programming, ai coding guide, advanced ai programming, advanced python ai, ai technology guide, creating ai with python, coding, programming, langchain, github, rag, rag ai, rag langchain, langchain rag app, retrieval augmented generation
Id: uN7X819DUlQ
Length: 48min 32sec (2912 seconds)
Published: Wed May 29 2024