ADVANCED Python AI Agent Tutorial - Using RAG

Captions
Get ready, because this video is going to get really cool really fast. I'll be showing you how to build an artificial intelligence agent that can utilize a bunch of different tools we provide it with. That means we make the tools, we give them to the AI, and it automatically decides the best tool to use. This is super cool, actually pretty easy to build, and even if you're a beginner or intermediate programmer you should be able to follow along.

With that said, let's get into a quick demo. In this demo you're going to see an agent that can answer questions about population and demographic data using something known as RAG. RAG is retrieval-augmented generation, and all this really means is that we're providing some extra data to the model so it can reason based on that, rather than on its old training data or something that might be out of date. In this instance I've got two data sources we're feeding to the model, plus another capability we'll talk about in a second.

First, we have this population.csv file. This is structured data, and it's typically pretty easy for a model to ingest and read; it has pretty much all of the information about population, density, change, etc., and the model will be able to answer questions based on this CSV file. We then have a PDF specifically about Canada. If we were building this project out we'd put in PDFs for all of the different countries so we could get more advanced information, but for demo purposes I've included one about Canada, which is my home country. It's just the Wikipedia page for Canada, and it allows the model to answer really specific questions based on the information in that PDF. The model can switch between these different data sources, or use both of them at the same time, to give us the answer we need.

We also have one more piece of functionality: notes. At any point in time I can ask the agent to take a note, and it will save the note for me in this notes.txt file. This is pretty simple functionality, but the idea is that you can give this agent as many tools as you want, and it will automatically select the correct tool and use it. That means you could tell the agent to call an API, or get it to do all kinds of advanced behavior. This video is really just going to scratch the surface and show you what's possible with this type of technology.

So let's run this and have a look at how it works. I've run the code, and I'm going to ask the first question: what countries have the highest population? Give me the top three. What's really interesting is that we get to see the thoughts of the model as it looks for the three most populous countries. After a bit of thinking it has run some operations, used the different data sources it has access to, and given us the top three most populous countries based on this data set: China, India, and the United States. We can ask it a bunch of other questions as well, and if we ask it to take a note, it will save that information in our notes. If I go to notes.txt, you can see it now says the countries with the highest population are China, India, and the United States. Pretty cool. That's the agent capability, right? It can go outside and interact with other systems and tools that we give it.
We can also ask for some specific information about Canada, and you'll see that the agent now goes to the Canada data source instead. I'm going to ask: what percent of Canadians speak English or French as their first language? (A few typos there, but not a big deal.) It decides to use the Canada data set this time, finds the information in there, and gives us the result: approximately 98% of Canadians can speak either English or French as their first language, which is correct based on that PDF data source.

This is super cool. I have a lot to share with you in this video: I'm going to show you how to build this exact application, and by learning this you'll see how you can extend it to use really any type of data source and any kind of agent capability. We're going to get into a step-by-step tutorial, but first I want to give you some more information about what we're about to build and the type of tooling we'll be using.

Really, what we need for our agent to work well is to provide the correct information in a way that the LLM can ingest. To do that, I've partnered with LlamaIndex for this video. Don't worry, it's completely free: they provide an open-source package that lets you ingest pretty much any type of data, whether it's structured or unstructured. To give you some context on why this is important: what we want to do with the agent is give it our own data and provide a set of tools it can act on. It can query something with this tool, save a note with that tool, maybe call an API. We could write all of that on our own, but it's a bit more difficult, and LlamaIndex gives us a set of tools for ingesting all different types of data.

I'm on their landing page right now, because it shows very quickly why we're using this free open-source tool. As it says: "Unleash the power of LLMs over your data." You can check it out from the link in the description. It allows us not just to ingest the data but to index it as well, and then gives us an interface where we can very easily query over that data, which is exactly what our agent will do; we'll use these data sources to provide context for the answers it gives us. You can see there are all kinds of apps you can build: question answering (similar to what we just did), data-augmented chatbots, knowledge agents, structured analytics. And what's great is that it works on unstructured, structured, and semi-structured data.

Typically it's a lot easier to read structured data: something in a CSV file or an Excel spreadsheet, with a known structure like rows and columns, or some format where we know how the data is laid out. With something like a PDF, which is what we'll use for this example, we have no idea how the data is structured or what it looks like, which makes it a lot more difficult to ingest. With LlamaIndex, as I'll show you in this video, we can read unstructured data as well, which really extends the capabilities of what we can do with our agent.
So what I want to do now is hop over to VS Code and start going through the setup steps. We're going to build out the individual components of the agent, see how they work in isolation, and then combine them to create the entire agent. You'll quickly learn how to make your own agents that can do pretty much anything you want.

For setup, we need to create a virtual environment, install a few Python packages into it, activate the environment, and then get the data we need for this specific agent. If you want to build a different agent, totally fine: you'll see quite easily how to ingest different types of data, and I'll talk about that when we get to that stage. For now, open up a new folder (I have one open in Visual Studio Code, but feel free to use any editor) and type the following command in your terminal or command prompt: python3 -m venv ai. The "venv" is the virtual environment module, and "ai" is the name; you can call it anything you want. This creates a new virtual environment within this directory. Note that on Windows you may need python -m venv ai, since python3 might not work for you.

Activation is a little different depending on your operating system. On Mac or Linux, type source ai/bin/activate, and you should then see the "ai" prefix in your prompt. On Windows, run ai\Scripts\activate; if that doesn't work, try it in PowerShell. If you want to deactivate later, simply type the deactivate command and that will deactivate the environment.

Next we'll install the Python packages we need: pip3 install llama-index pypdf python-dotenv pandas. These are the dependencies: llama-index lets us ingest these different data sources and sets up this whole agent, pandas is for reading our CSV file, pypdf is for reading the PDF file, and python-dotenv is for loading environment variables from a file, which we'll look at in a second.
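Cleaned up, the setup commands from this section look roughly like this (a sketch; it assumes python3 is on your PATH, and the Windows activation path is shown in a comment):

```bash
python3 -m venv ai           # create a virtual environment named "ai" (use "python" on Windows)
source ai/bin/activate       # activate it on macOS/Linux
# ai\Scripts\activate        # activate it on Windows (cmd or PowerShell)
pip3 install llama-index pypdf python-dotenv pandas
```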
Go ahead and hit enter; once the install finishes we'll continue with the rest of the video. All right, everything's installed, so now we'll get the data sources we need, along with our OpenAI API key, which lets us use the OpenAI models as part of this project.

I'm in my browser now, and we're going to download the data we need for this tutorial. The first file is something I found on Kaggle: world population by country, 2023. I'll leave the exact link in the description; it's free to download. Next we'll download a PDF version of the Wikipedia page for Canada. You can do this for any country you want; in fact it's very easy to switch the country, and you could even do it for hundreds of countries, but for our purposes we'll do a single one. On the Wikipedia page, go to Tools, then "Download as PDF". You can do this for any Wikipedia page; in our case it's Canada.

I've already got these downloaded. What we want to do is place them inside a folder called data: create a new folder in the same directory where you have your project open, and put the downloaded files inside it. Notice that I've renamed the files to canada.pdf and population.csv; just make sure you name them something that's easy to reference from the code. While we're here, we'll also make a new file called notes.txt, which will store the notes for this project (assuming we spell the name correctly), and then one more file outside the data directory, in the root of the project, called .env. This is where we'll store our OpenAI API key, since we'll be using OpenAI to actually interact with an AI model. Inside .env we write OPENAI_API_KEY= (OpenAI as one word), followed by the key itself. Let's go back to the browser to create that key.
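For reference, the .env file ends up being a single line; the value shown here is a placeholder, not a real key:

```
OPENAI_API_KEY=sk-your-key-here
```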
Head to platform.openai.com/api-keys; I'll leave this link in the description. I believe that to generate an API key you need a credit card on file, but there should be some free usage, and you shouldn't actually spend any money by running this just a few times from your code. Sign in to your OpenAI account, go to that URL, and click "Create new secret key". Give it a name; I'll call mine "llama" since we're using LlamaIndex. Create the key, copy it, and paste it into the environment variable file, and then we're good to go; we don't need the browser anymore. (Don't worry, I'll delete my key after the tutorial so you can't copy it.)

Now that that's done, we'll create some files and start by looking at how to query over pandas data. pandas is the popular Python data science library that lets us read in structured data like CSV files, so that's exactly what we'll do: read in our CSV file, then query over top of it and ask questions based on that data source using a streamlined, simple engine. After that we'll start adding extra data sources, and you'll see how we handle the PDF, how we make notes, etc.

To get started, make a new file called main.py; this is where we'll write our code. The first thing we need to do is load that environment variable file. To do that we use the dotenv library: from dotenv import load_dotenv, then call load_dotenv(), which looks for the .env file and loads in the environment variables. We'll also import os and import pandas as pd, which we'll use to read in our CSV file. There are a few more imports we'll need later, but for now this is enough to load population.csv. First we specify the path to our data: population_path = os.path.join("data", "population.csv"), joining the data directory with the file name. Then we load it with pandas: population_df = pd.read_csv(population_path). To quickly test that this is working, we can print population_df.head().
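As a sketch, main.py looks something like this at this point:

```python
# main.py - load environment variables and read in the population CSV (sketch)
import os

import pandas as pd
from dotenv import load_dotenv

load_dotenv()  # finds the .env file and loads OPENAI_API_KEY into the environment

population_path = os.path.join("data", "population.csv")
population_df = pd.read_csv(population_path)

print(population_df.head())  # quick sanity check: first five rows of the CSV
```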
Let's run it: python3 main.py (make sure your environment is activated, by the way). And there you go: we see the first five entries, the head of our data frame, meaning we're loading the pandas data frame correctly.

Now we'll create something known as a query engine, which will let us ask specific questions about this data source. We write from llama_index.query_engine import PandasQueryEngine; there are all kinds of query engines, but pandas is the one we want here. Then: population_query_engine = PandasQueryEngine(df=population_df, verbose=True). With verbose=True, it prints the model's thoughts and more detailed output when we use the engine. All this is really doing is wrapping the data frame and giving us an interface for asking questions about the data using this retrieval-augmented generation setup. There are many kinds of query engines you can create with LlamaIndex; pandas is just the one we're using right now.

Now that the query engine is defined, there are a few things we can pass to make it work a little better and optimize its performance. We want to give it an instruction string that specifies what it should be doing, and a template for how the prompt should be handled when we start querying it. I'm going to copy in the prompt templates and strings we'll use, since they're a bit long; you can find all of this code at the link in the description. Make a new file called prompts.py and paste them in.

What I've pasted says from llama_index import PromptTemplate, and then defines two things: an instruction string and a new prompt. The instruction string tells the engine what it should do with our pandas data: convert the query to executable Python code using pandas; the final line of the code should be a Python expression that can be called with the eval function; the code should represent a solution to the query; print only the expression; do not quote the expression. The new prompt is a prompt template, something we can embed whatever we type into, to provide more context to the model when it performs the query. In our case it says: you are working with a pandas data frame in Python; the name of the data frame is df; this is the result of df.head(); follow these instructions (the instruction string); and then the query string, which is what we actually give it.
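Reconstructed from what's read out above (the exact strings are in the code linked in the description, and the import path matches the llama-index version used in the video), prompts.py looks roughly like this:

```python
# prompts.py - instruction string and prompt template for the pandas engine (sketch)
from llama_index import PromptTemplate

instruction_str = """\
    1. Convert the query to executable Python code using Pandas.
    2. The final line of code should be a Python expression that can be
       called with the `eval()` function.
    3. The code should represent a solution to the query.
    4. PRINT ONLY THE EXPRESSION.
    5. Do not quote the expression."""

new_prompt = PromptTemplate(
    """\
    You are working with a pandas dataframe in Python.
    The name of the dataframe is `df`.
    This is the result of `print(df.head())`:
    {df_str}

    Follow these instructions:
    {instruction_str}
    Query: {query_str}

    Expression: """
)
```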
This is just templating what we want the actual prompt to look like, making it easier for us as users to give a short, human-readable query. With that in prompts.py, we import these strings and use them with our population query engine: from prompts import new_prompt, instruction_str. Inside the PandasQueryEngine constructor we pass instruction_str=instruction_str, and updating the prompt is slightly different: we call population_query_engine.update_prompts() and pass a Python dictionary mapping "pandas_prompt" to the new prompt.

Now that we've given it the context and the prompt, we can give it a query and see if we get a result: population_query_engine.query("What is the population of Canada?"). Save the file and rerun. You can ignore all of the extra output here (that's just verbose mode, which we can disable if we want), and you can see we get the population: 38,781,291. So that's working: we can query directly against this source. But what will really happen is that our agent will use this as a tool: it will take that output, combine it with anything else it needs, and give us a more human-readable response. So this is how you load pandas data; it's structured data, which is a bit easier to read, and you can see it's pretty simple: we just use the PandasQueryEngine.
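Wired together, the query engine section of main.py looks roughly like this (it continues the main.py sketch from above, so population_df is assumed to already be defined):

```python
# main.py (continued) - pandas query engine over the population data (sketch)
from llama_index.query_engine import PandasQueryEngine

from prompts import instruction_str, new_prompt

population_query_engine = PandasQueryEngine(
    df=population_df, verbose=True, instruction_str=instruction_str
)
population_query_engine.update_prompts({"pandas_prompt": new_prompt})

# Direct test query (removed later, once the agent drives this engine itself):
# print(population_query_engine.query("What is the population of Canada?"))
```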
Now I want to show you how we can interface with a tool, so we can have multiple tools: in this case the population engine as well as a note engine, so we can get some information and then ask the agent to take a note of it. For the note engine, we'll make a new file to keep things nice and organized, called note_engine.py. Inside, we write a Python function that can be executed by our model. We can make this as complex as we want; in our case it's a simple function that takes in a note and saves it to a file.

We start with from llama_index.tools import FunctionTool; we'll use this to wrap our function and tell LlamaIndex, "hey, this is a tool the model can use." Then we write the location of the file we'll save notes to: note_file = os.path.join("data", "notes.txt") (which reminds me, we also need to import the os module). Next we define save_note, which takes in the note we want to save. First we check whether the file exists with os.path.exists(note_file); if it doesn't, we create it by opening the file in "w" mode, which pretty much just creates a new empty file. Then we open the file in "a" mode, which stands for append mode, as f; appending just means we start writing at the end of the file. We call f.writelines(), passing a list that contains the single note with a "\n" appended, so the next note goes down to a new line. Finally, we return something like "note saved". The reason we do this is that the LLM can look at the function's return value, so it can see whether the call succeeded or something went wrong; we want to give it something indicating that the note really was saved. You can return any type of data here, and while this is a really simple function, you could have functions that do complex calculus, or that go onto my computer and clean up some files. Any code you can write, the LLM can call for you, which is where I think this gets quite cool.

Now that we have the tool we want the LLM to have access to, we need to wrap it and create the engine the agent will use: note_engine = FunctionTool.from_defaults(...), where we specify the function with fn=save_note, give it a name, and pass a description. The information we pass here helps the model understand what this tool does. For the name I'll use "note_saver", and for the description, "this tool can save a text based note to a file for the user". You can write this in more detail if you want, but you should give it a decent name and a decent description that specify what the tool does, so the model knows how to pick between the different tools.
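Put together, note_engine.py looks roughly like this:

```python
# note_engine.py - a simple "save a note" tool the agent can call (sketch)
import os

from llama_index.tools import FunctionTool

note_file = os.path.join("data", "notes.txt")


def save_note(note):
    # Create the file the first time, then append each note on its own line.
    if not os.path.exists(note_file):
        open(note_file, "w")

    with open(note_file, "a") as f:  # "a" = append mode, write at the end
        f.writelines([note + "\n"])

    return "note saved"  # the LLM reads this return value to confirm success


note_engine = FunctionTool.from_defaults(
    fn=save_note,
    name="note_saver",
    description="this tool can save a text based note to a file for the user",
)
```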
Pretty straightforward: we're just defining a function, and again, you could use any type of function you want here. What we'll do now is bring this function in and start specifying a pipeline of tools the agent has access to; it will then choose which one it needs to use. In main.py: from note_engine import note_engine.

There are a few more imports we need: from llama_index.tools import QueryEngineTool, ToolMetadata; from llama_index.agent import ReActAgent; and from llama_index.llms import OpenAI. That should be most of the imports. Now we specify the different tools we have access to: tools = [...], a list. The first tool is simply the note engine, so we just put that inside. The next tool is the population query engine. By the way, let's remove that manual .query() line, because we don't want to be querying it by hand anymore. We wrap the engine in a QueryEngineTool so we can attach a description to it: QueryEngineTool(query_engine=population_query_engine, metadata=ToolMetadata(...)). In the tool metadata, similarly to the tool we just created, we give it a name and a description: the name is "population_data", and the description says this gives information about the world population and demographics. So now we have two tools: the note engine and the query engine tool, which is just a wrapper that lets us provide metadata.

Next we set up an agent that has access to these tools, and you'll see how it works and how it can query this data. We write llm = OpenAI(model="gpt-3.5-turbo-0613"); I'm sure there are newer versions, but this is the one I was using that worked well. Then: agent = ReActAgent.from_tools(tools, llm=llm, verbose=True), with verbose on so we get detailed information on the LLM's thoughts and can see which tool it's going to use. ReActAgent.from_tools just sets everything up for us: we pass in the individual tools, and it tells the agent to pick the best tool for the job; it will then do that, and we'll be able to utilize the agent. One more thing we can specify is context: any string that tells the agent beforehand what it's supposed to be doing, so it has some more information (and, well, context) about its job. We add a context string in prompts.py, context = ..., which I'll paste in; you can copy it from the link in the description. It says: the primary role of this agent is to assist users by providing accurate information about world population statistics and details about a country.
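Back in main.py, the tool list and the agent come together roughly like this (continuing the earlier sketch, so population_query_engine is assumed to exist):

```python
# main.py (continued) - register the tools and build the agent (sketch)
from llama_index.agent import ReActAgent
from llama_index.llms import OpenAI
from llama_index.tools import QueryEngineTool, ToolMetadata

from note_engine import note_engine
from prompts import context

tools = [
    note_engine,
    QueryEngineTool(
        query_engine=population_query_engine,
        metadata=ToolMetadata(
            name="population_data",
            description="this gives information about the world population and demographics",
        ),
    ),
]

llm = OpenAI(model="gpt-3.5-turbo-0613")
agent = ReActAgent.from_tools(tools, llm=llm, verbose=True, context=context)
```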
Okay, pretty straightforward, but that is our context. Back in main.py, we import it from prompts and pass context=context to ReActAgent.from_tools. Now we can set up a simple while loop that continually sends prompts to the agent; the agent then utilizes its tools and gives us responses. We'll use the walrus operator here, which was introduced in Python 3.8, so make sure you have at least that version: it lets us define a variable prompt equal to whatever the user inputs, in the loop condition itself. If they type q, we quit; otherwise we set result to agent.query(prompt) and print the result.
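A minimal sketch of that loop (the prompt text is paraphrased from the video):

```python
# main.py (continued) - simple prompt loop over the agent (sketch)
while (prompt := input("Enter a prompt (q to quit): ")) != "q":
    result = agent.query(prompt)
    print(result)
```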
Let's run the code. We can start by asking it to save a note: "can you save a note for me saying I love Tim". It picks note_saver with "I love Tim", says "I can answer without using any more tools", and reports that the note was saved successfully; if we go to notes.txt, you can see it says "I love Tim". Then we can ask: "what is the population of Vatican City?" It goes to look at the population data, and I guess Vatican City is probably not in there, hence why it can't find it. Let's try "what is the population of India". There you go: it finds the population and tells us, in human-readable form, what it is. If you don't like all of this output being spit out, you can simply remove verbose mode and you won't see the thoughts, but I think it's actually pretty interesting, and it shows how this is working. All right, let's quit out of that by hitting q.

Now I want to start reading in that PDF data and adding it to our tool set, so we can get specific information about Canada and see how to read unstructured data. Let's make a new file called pdf.py. Since the PDF is unstructured, we'll read it in a little differently, using one of the readers that comes with LlamaIndex. You might be asking: okay, what's different about unstructured versus structured? With unstructured data we can't just put everything into a nice table that's easy to access and easy to reference with something like pandas code. Instead, we're going to use something known as a vector store index. The way it works is that we take all of our data and create something known as embeddings for it. Embeddings are multi-dimensional vectors, and we can very quickly index and query them in this database by checking for similarity of intent and meaning. It's a lot more complicated than I can really explain in a few minutes, and I'd encourage you to look it up if you're curious about how it works, but in our case, since we're working with unstructured data, we turn it into a vector store index containing all of our information; then we can go to that index with our query and very quickly retrieve the specific parts of the unstructured data we need to answer the question. Hopefully that's a simple enough explanation; it's probably not 100% accurate, but it gives you an idea of what's going on.

Inside pdf.py we import a few things: import os, and from llama_index import StorageContext, VectorStoreIndex, load_index_from_storage. The great thing is that we don't need to keep recreating this index: we make it one time, and once it's created we can just read from it, hence importing a function like load_index_from_storage, because it might already exist. We also get the PDF reader: from llama_index.readers import PDFReader. There is a general reader in LlamaIndex that can read most types of data (text files, PDFs, all different kinds of things), but I want to show you how to use a specific reader and how to find a reader for whatever type of data you have. There's something called LlamaHub, which is a directory of all the different readers you have access to; I'll link it in the description. The way I found the PDF reader was by going to LlamaHub and typing "pdf", which showed me the file PDF entry; I clicked on it, saw there was this PDFReader I could use, and found it was actually included by default in llama-index, so I just went ahead and used it. If you browse around, you'll see there are all kinds of readers: unstructured, Markdown, Google Docs, PowerPoint, MongoDB. Pretty much anything you want to read in is already there, so you don't need to manually write that integration. I'll also mention there's something more general called the SimpleDirectoryReader, which reads an entire directory and can handle all different types of data, including PDFs. I didn't want to use that one for this tutorial, but it does exist, and if for some reason it can't handle your data, you can probably find an integration on LlamaHub that's already been built.
Anyway, back inside our code, we get access to the Canada PDF file: pdf_path = os.path.join("data", "canada.pdf"), then canada_pdf = PDFReader().load_data(file=pdf_path), creating an instance of the reader and loading the file. Now that we have the loaded PDF, we write a function that gets the vector store index for us: def get_index(data, index_name). We start by checking whether the index already exists on disk; if it does, we can simply grab it, otherwise we need to create it. So: index = None; then if not os.path.exists(index_name) (the folder we'll store it under), we need to create it, otherwise we can load it; and eventually we just return the index.

Fortunately, we don't need to be math geniuses to do this; we can just use LlamaIndex to create the index. We print "building index" along with the index name, then index = VectorStoreIndex.from_documents(data, show_progress=True). When we load the PDF with the reader we get documents; we pass those as our data and create an index from them, showing the progress as we go. After creating it, we want to store it: index.storage_context.persist(persist_dir=index_name), which saves the index for us in a folder. In the other branch, if we already have the index, we load it: index = load_index_from_storage(StorageContext.from_defaults(persist_dir=index_name)), telling it where the index lives and loading it from storage. If you're wondering, "Tim, how the heck do you know how to type all this stuff?": I just found it in the LlamaIndex documentation, which you can reference as well; this video is to help you get through it a little faster, because it did take me a while going through the docs.

So, to recap: does the index exist? If it doesn't, build it and then save it; if it does, load it from storage; then return it. Now we call the function: canada_index = get_index(canada_pdf, "canada"). The first argument has all of our documents and data, and I'm just naming the index "canada". The last thing we do is create an engine: canada_engine = canada_index.as_query_engine(), which uses the vector store index as a query engine, just like the query engine we had for our population data.
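All together, pdf.py looks roughly like this (import paths again follow the llama-index version used in the video; newer releases have moved these modules around):

```python
# pdf.py - build or load a vector store index over the Canada PDF (sketch)
import os

from llama_index import StorageContext, VectorStoreIndex, load_index_from_storage
from llama_index.readers import PDFReader


def get_index(data, index_name):
    index = None
    if not os.path.exists(index_name):
        print("building index", index_name)
        # Embed the documents and build the index, showing progress as it goes.
        index = VectorStoreIndex.from_documents(data, show_progress=True)
        index.storage_context.persist(persist_dir=index_name)  # save it to disk
    else:
        # The index already exists on disk, so just load it back.
        index = load_index_from_storage(
            StorageContext.from_defaults(persist_dir=index_name)
        )
    return index


pdf_path = os.path.join("data", "canada.pdf")
canada_pdf = PDFReader().load_data(file=pdf_path)
canada_index = get_index(canada_pdf, "canada")
canada_engine = canada_index.as_query_engine()
```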
Now that we have this, it's an additional tool we can use with our agent. We could call canada_engine.query(...) directly, since the query interface is available, and it would give us some result, but we already know how that works. Instead, we go back to main.py and import it: from pdf import canada_engine. Then, just like before with the query engine tool, we copy that wrapper, paste it, and specify the Canada engine, with tool metadata naming it "canada_data" and a description saying this gives detailed information about Canada the country. So now we've added another tool to the tool set the agent can use.
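The final tools list in main.py, with the Canada engine added, looks roughly like this (a sketch; the imports and the population engine come from the earlier snippets):

```python
# main.py (continued) - final tools list with both query engines (sketch)
from pdf import canada_engine

tools = [
    note_engine,
    QueryEngineTool(
        query_engine=population_query_engine,
        metadata=ToolMetadata(
            name="population_data",
            description="this gives information about the world population and demographics",
        ),
    ),
    QueryEngineTool(
        query_engine=canada_engine,
        metadata=ToolMetadata(
            name="canada_data",
            description="this gives detailed information about Canada the country",
        ),
    ),
]
```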
And that's it; that actually finishes the entire project. Let's run the code and make sure everything works, giving it a query that will force it to use the Canada data set. First it says "building index: canada", parses the nodes, and creates the embeddings. Notice it creates a new directory called canada for us, with all these different files containing information about the vector store index; they don't mean anything to me, and they probably don't mean anything to you, but they're there. Now we can ask some questions. I'll say something like "tell me about the languages spoken in Canada", and it says, okay, I need to use a data source; it uses the canada_data tool with input "languages" and gives us a big blurb: in Canada the two main languages are English and French; approximately 98% of Canadians can speak either or both of English and French; English is spoken by about 57% of Canadians; and so on. We can ask it more and more questions, so let's say "save a note of that". I'm hoping this one works; for some reason it's not using the previous answer in the note, so let's ask the same thing again: "tell me about the languages in Canada and save a note of that". Let's give it a second... there you go: this time it ended up using both of the tools. It says it can use the canada_data tool and then the note_saver tool to save a note (kind of cool that it deduces that), it gets the language information, spits it out as a note, and we can obviously check notes.txt and see that information stored.

So with that said, that wraps up the coding component of this video. All of this will be linked in the description in case you got lost or want to copy any of it. It's really interesting to me what's possible with agent-based AI: being able to pass these different tools, data sets, and functions to an LLM and letting it read and use them. It allows you, as the human, to keep a little more control, while letting the agent make the decision on which tool it should use and what it actually needs to do. I can already imagine a ton of great applications you could build with this type of technology that wouldn't be overly complicated, because of how easy it was to use LlamaIndex for the query engines, and even just for wrapping Python functions. There are probably a hundred different Python functions I could write that would be super cool for this agent to go and execute, and I could probably create an entire application quite a bit faster just by making all those individual tools, wrapping them with this LLM, and saying: okay, agent, go use the right tool based on whatever the response is. Anyway, I'm going to wrap it up here. If you enjoyed the video, make sure you leave a like, check out all the documentation and code from the link in the description, and I look forward to seeing you in another YouTube video.
Info
Channel: Tech With Tim
Views: 24,752
Keywords: tech with tim, Advanced Python AI, AI Agent Tutorial, Tech with Tim Python, RAG AI Model, Python AI Projects, Python Programming, AI Development, Machine Learning Python, RAG Agent Creation, Artificial Intelligence Tutorial, Python Coding AI, Python RAG Tutorial, Tech with Tim AI, Advanced AI Programming, Python Machine Learning, Creating AI with Python, AI Technology Guide, Python AI Techniques, RAG Model Tutorial, AI Coding Guide
Id: ul0QsodYct4
Length: 40min 59sec (2459 seconds)
Published: Wed Feb 14 2024