Development with Large Language Models Tutorial – OpenAI, Langchain, Agents, Chroma

Captions
Welcome to this course about development with large language models, or LLMs. Throughout this course you will complete hands-on projects that will help you learn how to harness the immense potential of LLMs for your own projects. You'll build projects with LLMs that will enable you to create dynamic interfaces, interact with vast amounts of text data, and even empower LLMs with the capability to browse the internet for research papers. You'll learn about the intricate workings of LLMs, from their historical origins to the algorithms that power models like GPT. Akshat is a passionate educator in the field, and he teaches this course. Hi, welcome to this course on LLM engineering and development, brought to you by Loop.ai. My name is Akshat, and I'm excited to be teaching you this course. So who should watch this? Anyone who's interested in learning hands-on LLM usage and theory through explanations and multiple guided projects; fairly basic Python programming knowledge should be enough to follow along comfortably. Here are all the projects that we're going to be working towards: we're going to create a clone of the ChatGPT user interface, along with the large language model that lets us interact with it using custom personas. We're going to have conversations with our documents, like text files and PDF files that ChatGPT may not have been trained on. We're going to use agents, which are self-prompting large language models, and enable them to browse the web and research literature, such as research papers, using the arXiv API. We're going to enable these large language models to use more than five tools in the real world, along with equipping them with their own custom tools. So here's all the course content, and let's get started with a basic introduction to LLMs. An LLM basically happens when you combine a massive neural network with huge amounts of data and train it on that data.
Once it's been trained, you then align it to human values in an attempt to create a reasoning engine. Examples of these LLMs are BERT, LLaMA, and, more famously, GPT-3.5 and GPT-4. The concept of LLMs has been around since 1996, but why have they only recently gained traction? The reason is that this is the first time in history that LLMs have actually been able to outperform human reasoning in certain contexts, thanks to huge improvements in performance and scale. Now, more on scale: as you can see, modern LLMs like GPT-3.5 have a huge amount of money and scale that goes into developing them. GPT-3.5 has over 175 billion parameters, and you can think of parameters as neurons in your brain. Along with this, it has huge amounts of training data. GPT-3.5 is not where this cycle of LLM development stops, because OpenAI has another model called GPT-4 that is essentially an upgraded version of GPT-3.5: it performs all the tasks GPT-3.5 can, but just seems to meet the goal a lot better. The reason for this is its huge number of parameters, reportedly 1.76 trillion, and with this it has the capacity to undergo training with even greater amounts of training data. A huge model like this, trained on this much data, is pretty dangerous to humanity, so aligning the model to human values and feedback is very important and plays a major role. Apart from GPT-3.5 and GPT-4 by OpenAI, there are several competitors in this landscape, like Microsoft, Facebook, and Google, who are all actively publishing research papers and breakthroughs pretty much every week at the time this is recorded. So now let's look at some of the algorithmic breakthroughs that got us to this point. I'm going to go through these by explaining the typical architecture you would follow if you wanted to train your own custom large language model. Let's start with choosing the architecture and tokens. A large language model, you can think
of as basically a mathematical function that has learned to predict the right words given some input context. If you want to use it as a mathematical function, you obviously have to deal in numbers, so we use something called tokenization that converts a string of text into a vector of numbers: the tokenizer splits the text into individual elements and then assigns each one a unique number. Once you assign the unique numbers, you also need to tell the LLM when to actually stop generating, or otherwise it'll be stuck in an infinite loop, and that's why you have this stop token here. Now let's look at the brain, which is what actually learns these relationships between words and predicts the right word. You don't really need to know all the complex math that goes into making this neural network for this course, but I'm going to give you a quick intuition as to how it works. Say an input sequence, like this word here, gets inputted. You can think of all these layers as random numbers initially, and what you can think of them doing is multiplying this input sequence, many times over, until you get an output. Obviously, if all these layers are set to random numbers, you're going to get a pretty bad, random guess. So after we collect the output, we look at what word we actually wanted the model to produce and compute a difference between the value we predicted and the actual value we need. Once we have that, the algorithm takes note and adjusts all of these parameters here to step in that direction. Once that's done, we've pretty much finished one training step, and, adding to this overall scale concept, this whole training process occurs hundreds of billions of times. So you can imagine that even after the first ten million iterations, it's going to get pretty good at guessing what the next word is supposed to be.
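To make the tokenization idea concrete, here is a toy word-level tokenizer. Real LLM tokenizers split text into subword pieces, but the principle (text in, vector of numbers out, with a reserved stop token) is the same; everything here is illustrative, not a real tokenizer API.

```python
# Toy word-level tokenizer: assigns each distinct word a unique number
# and appends a reserved stop token, as described in the transcript.

STOP_TOKEN = 0  # reserved id that tells the model when to stop generating

def build_vocab(corpus: str) -> dict:
    """Assign a unique number to every distinct word in the corpus."""
    vocab = {}
    for word in corpus.split():
        if word not in vocab:
            vocab[word] = len(vocab) + 1  # 0 is reserved for the stop token
    return vocab

def tokenize(text: str, vocab: dict) -> list:
    """Convert a string into a vector of numbers, ending with the stop token."""
    ids = [vocab[word] for word in text.split()]
    return ids + [STOP_TOKEN]

vocab = build_vocab("the cat sat on the mat")
print(tokenize("the cat sat", vocab))  # [1, 2, 3, 0]
```

A real tokenizer (like the byte-pair-encoding ones OpenAI models use) would also handle unknown words and punctuation, but the numeric-vector idea is identical.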
Now, more on the training process I was just talking about. The way these large language models are trained is through a next-token prediction task, which basically gives the model a question as well as a blank that it's supposed to fill in. Here, obviously, the answer is "books": the model is given all this surrounding context and is asked to predict the word. These are just four questions here, but the model gets inputted billions of questions from data ranging from code to college textbooks to articles to lyrics to podcasts, and pretty much any data that you can scrape off the internet. Good. So now we've effectively trained a model that can predict the next token in our sequence, but this is still very limiting because it's just one token; we want our LLM to ideally express ideas and thoughts and actually reason. The way we do that is by predicting this next token and then inputting the predicted token back into the model so that it can predict again. You just keep collecting these predictions and you'll get a string of text. And, going back to the stop token, this is where it's important: once the model is finished with a thought, it needs to be informed that it should just stop. Great, so now we have this huge LLM that's been trained on a bunch of different data, and we can use it as a reasoning engine. But maybe the LLM doesn't have the knowledge it needs in order to work in your specific use case, and this is where fine-tuning comes in. You can use fine-tuning to further train a model on your own personal context, and this is good because you don't need that much data anymore; you just need a small labeled corpus of your example data. Examples of things you can use fine-tuning for are generating custom Midjourney or image-generation prompts for another model, as well as letting it learn information beyond its knowledge cutoff, say, the latest cancer research.
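The token-by-token generation loop described just above (predict, feed the prediction back in, repeat until the stop token) can be sketched like this; `fake_model` stands in for the real neural network and just replays a canned continuation, so the whole thing runs without any ML library.

```python
# Autoregressive generation sketch: keep predicting the next token and
# appending it to the context until the model emits the stop token.

STOP = "<stop>"

def fake_model(context):
    """Stand-in next-token predictor with canned continuations."""
    continuation = {
        "I like": "large",
        "I like large": "language",
        "I like large language": "models",
    }
    return continuation.get(" ".join(context), STOP)

def generate(prompt):
    tokens = list(prompt)
    while True:
        next_token = fake_model(tokens)   # predict one token...
        if next_token == STOP:            # ...until the stop token ends the thought
            break
        tokens.append(next_token)         # feed the prediction back in
    return tokens

print(generate(["I", "like"]))  # ['I', 'like', 'large', 'language', 'models']
```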
A quick note here on that second example, where you're actually teaching the model information: fine-tuning is not the most efficient way to do that. We're going to be using a much better method called vector databases in this course, more on that later, but generally you'd fine-tune if you want to change the model's behavior, like in this example here. So now that we have this model, with its optional fine-tuning, you can further take it to production and make it safe, so that it doesn't spew harmful content, through RLHF, which is reinforcement learning from human feedback. The reason we do this is because the training data we scrape comes from a bunch of different sources, like articles, podcasts, textbooks, lyrics, etc., and this data is bound to have a lot of bias in it; alignment with human values basically fine-tunes the model to remove that bias. Here's what OpenAI does to fix this issue: basically, it uses human labelers to rate safe outputs and reinforces the model to produce these safe tokens. Okay, so now we've pretty much gone through the entire pipeline, everything from training to fine-tuning to aligning with human values, and now we're actually going to look into how to get output from our models. The process of asking a large language model a question is called prompting, and the model generates something called a completion, which is just a string of text that's likely to complete your previous text. Inference parameters are another technique you can use to tune the creativity of the model, and we'll be trying this out in the ChatGPT playground. These are the four parameters you can change: the first two, temperature and top-p, basically make your outputs more random by changing things like the window of probable words and how much low-probability words are weighted. Frequency penalty and presence penalty are the other two inference parameters, and they
basically make sure that your model doesn't output the same answer in the same style for every single identical question that you ask it. So now let's go ahead and try this out in the ChatGPT playground. A quick note from future me: if you already know the basics of ChatGPT and calling the API, you should skip the next two sections and go to the third section, where we'll be learning how to clone the entire ChatGPT user interface. So now let's explore the playground environment. In your browser, just type in "ChatGPT playground" and you should be presented with this user interface here. Another way you could get here is through the OpenAI website itself, where you just log in, hit API, and then hit Playground. The reason why we're choosing to use the playground over the usual ChatGPT user interface is because the playground gives us more customizability, as well as a better feel for the actual API that we're going to be using throughout this course. So let's get started. From left to right: in this column here you can add in your system message, like "You are a programmer," something like that, and basically the system message assigns the LLM's personality for the entire conversation. Here you can send a first message, hit submit, and it should give you some output. As you can see, it does give us the output, and one thing to notice here is that there's always an alternation between the user and the assistant. In the API as well, when we script some kind of conversation, we're going to be using this, so just remember that it's only valid as user followed by assistant. Now let's move on to this main column here, which is going to let us customize a bunch of different parameters. It's generally recommended to keep the mode on chat, but you can explore the models here, and we have options between the versions
of the GPT-3.5 line and the GPT-4 line. The difference between GPT-4 and GPT-3.5 is that GPT-4 is slower, but it's a lot better at logical reasoning and creativity-related tasks because it's trained on so much data, and GPT-3.5 is obviously the opposite: it's faster, but a little worse at these tasks. So there's a trade-off between intelligence and speed here, but in general I would recommend using GPT-4. We can give it another message, maybe asking it to write a poem, and, as you can see, the previous response came in much faster than how this one is being output now. This process of stringing the text together live is called streaming: instead of presenting the whole text after it's all done processing, you show it as it's being put together. Now once we have this, we can actually change the other parameters here, these inference parameters. Temperature is a measure of creativity, as I've mentioned before: when you put temperature at zero it's very deterministic, which is useful when you're just trying to understand some kind of data, and when temperature equals one it's a lot more creative, so you can use it for poem writing or open-ended generation in general. Maximum length is pretty obvious. We can test out the deterministic part of this by comparing two outputs, so let's ask it to write code for a sorting algorithm from scratch. I'll just copy this and submit it, and while that's executing, I'll go over top-p, which is another parameter that allows us to control the model's creativity; it's generally advised to use only one of these parameters at a time, either temperature or top-p. Moving on, we have frequency penalty and presence penalty: if you want the model to output a different answer for every repeat of the same question, or just cut down the number of repeated words in an answer, you need to use the frequency penalty.
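The temperature behavior described above can be made concrete with a toy sketch: a model assigns scores ("logits") to candidate next words, and dividing those scores by the temperature before normalizing them into probabilities makes low temperatures near-deterministic and high temperatures flatter. The function name and the numbers below are invented for illustration; this is not OpenAI's actual implementation.

```python
# Temperature-scaled softmax sketch: lower temperature concentrates
# probability on the top word, higher temperature spreads it out.
import math

def softmax_with_temperature(logits, temperature):
    if temperature == 0:  # temperature 0: always pick the most likely word
        probs = [0.0] * len(logits)
        probs[logits.index(max(logits))] = 1.0
        return probs
    scaled = [l / temperature for l in logits]
    exps = [math.exp(s - max(scaled)) for s in scaled]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate words
print(softmax_with_temperature(logits, 0))  # [1.0, 0.0, 0.0] -- deterministic
print([round(p, 2) for p in softmax_with_temperature(logits, 1.0)])
```

Top-p works on the same probabilities but truncates the candidate pool instead of reshaping it, which is why it's usually advised to adjust only one of the two.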
Anyway, we ran this with temperature equal to zero, so now let's compare that output to one with temperature set to something higher, like one. While that's running: as I said before, the frequency penalty penalizes a word based on how often it occurs, so repeats get cut down, and the presence penalty penalizes the model based on just whether or not the word already appears in the output. So let's compare these two outputs with different temperatures: here the algorithm is using bubble sort, but here the algorithm is using selection sort. As you can see, although both answers are correct, the way the model approaches them is going to be widely different based on the temperature or the top-p. Another quick thing we can do here that makes our life a lot easier, when we want to use this in code (which we'll get to in the next session), is to hit this View code button; it gives us everything that we actually need to get started. All we have to do is copy this code, and I'll see you in the next video, where we're actually going to work through this API. On our YouTube channel you should find this notebook linked in the description below. This is called a Colab notebook, which is just an online environment that helps us run our Python code. To get started, all you have to do is press Shift+Enter, and you can work through the whole Colab notebook that way. We'll just wait for all the necessary Python packages to install; these packages basically let us call the OpenAI API and actually access ChatGPT. So let's wait for that to finish and go through what this OpenAI API key means. Okay, so now let's move on to this API key. What is an API key? Since OpenAI bills you for using the API, you'll need a password so that no one but you can get access, and that's what the API key is: it's just a password to your API. Obviously I'm going to be
revealing this password to you guys, but I'm going to deactivate it as soon as this tutorial ends, and I'm going to show you how to do all of that, so you don't have to worry: even if someone gets access to this key, you can just disable it. What you would do is go to your account, hit View API keys, and once you're in this dashboard you can create a new secret key. You can call it anything; I'll just name it for this course and create the secret key. I'm going to copy it now, and once it's copied I can just paste it in here. This will be a password that's unique to you, and say you've accidentally revealed it: what you can do is hit the trash icon here, which disables the key, so you can't access the API through it anymore. I'm not going to disable this one yet, but I've made a couple of others here that I am going to disable, and I'll be disabling this API key as soon as this video ends. So let's hit Shift+Enter, and that runs this code; your environment variable is now the OpenAI key. This is just an example of how we call the API, but I'm not going to use this example; I'm going to go back to the playground and use the example that OpenAI has provided. In this case, let's just hit View code and copy that. We can remove this OpenAI key line if it's already defined up here, and now, in this request body, in the content here, we can just say "What is your name?" and I'll show you the other parameters after we're done with this. As you can see, it does run, but it doesn't just return us the response: the whole value is stored in this response variable, so I can print out the response, and as you can see it gives us a bunch of data here, and the only thing we're actually interested in is this part here,
right, where it says the assistant's content. Notice how earlier I set the role equal to "user," but here the role is equal to "assistant," and the content is "I am an AI language model developed by OpenAI, I don't have a personal name," and so on. Obviously this content is the only thing we want the model to output, and everything around it is just metadata that we shouldn't be displaying. So what we can do is extract just the reply: this goes through the entire JSON data and gets us just the content. It's the same idea here, so let's run that. Another thing you can do: again, everything that you do in the playground, you can do here, so you can add a system message, and the way you do that is by defining another dictionary where you put the role as "system" and the content as whatever the system message is. Here it says "You are a helpful assistant" that is, say, obsessed with potatoes. Shift+Enter again; it should take a little while, but we should be looking at the output pretty soon. Yeah, so here, as you can see, it does assign that personality, because it's talking about potatoes, but it also completes the task as well. Another thing you can notice in this example is that I have the power to change everything that I changed in the playground: the model, the messages, the temperature. If you come up here, you can see that everything I changed up there can be assigned to a variable and changed; I'm not going to do that because I don't think it's very necessary for this tutorial, but we can look at some more prompting now. One thing to note is that GPT-3.5 doesn't really pay attention to the system message as much, so whenever you want to rely on a system message, you should generally use GPT-4.
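To make the indexing concrete, here is a sketch of pulling the reply out of a response shaped like the one printed in the notebook (the openai Python library as used in the video, pre-1.0, returns a JSON-like structure of this shape). The response dict below is hardcoded sample data, not a real API call, so it runs without an API key.

```python
# Sample response mimicking the structure shown in the notebook;
# the values here are made up for illustration.
response = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": "I am an AI language model developed by OpenAI.",
            }
        }
    ],
    "usage": {"prompt_tokens": 12, "completion_tokens": 11},
}

# Index through the JSON to pull out just the assistant's reply:
# choices -> first element -> message -> content
reply = response["choices"][0]["message"]["content"]
print(reply)
```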
Now let's go to this example here, which is few-shot prompting. If you remember what I said about fine-tuning in the slides explanation, this is sort of analogous to that: you're giving the model example answers by scripting this entire conversation. You're saying the system message is this, the user has inputted this, and the assistant has responded like this. This exchange didn't actually happen, but you're saying that this is the ideal response the assistant should be giving, and once you give it enough examples it learns to output answers the way the user desires. Here, let's change the model to GPT-4 and run this code. As you can see, it does give us this output, and we can compare the output it gave previously, without few-shot prompting, to this one; I would say the output before was a lot less concise and on-form than this. So that's another example of few-shot prompting, where we can assign the system message as something that follows a pattern so that it reinforces whatever the model learns. I highly recommend you go into this notebook, add your own API key, and simulate a bunch of conversations while tweaking these parameters, to get a feel for it, because there is no ideal set of parameters for every problem, much like there is no ideal prompt for every problem; it's a process that comes from trial and error and trying out a bunch of different solutions.
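The few-shot scripting described above is just a messages list with invented user/assistant pairs placed before the real question. The example pairs below are illustrative, not the ones from the video; the key point is that roles must alternate user, then assistant.

```python
# Few-shot prompting: scripted example exchanges teach the model the
# desired answer style before it sees the real question.
few_shot_messages = [
    {"role": "system", "content": "You answer in exactly one short sentence."},
    # scripted example 1 (this exchange never actually happened)
    {"role": "user", "content": "What is Python?"},
    {"role": "assistant", "content": "Python is a general-purpose programming language."},
    # scripted example 2
    {"role": "user", "content": "What is an API?"},
    {"role": "assistant", "content": "An API is an interface for programs to talk to each other."},
    # the real question, answered in the style the examples establish
    {"role": "user", "content": "What is an LLM?"},
]

# sanity check: after the system message, roles alternate user/assistant
roles = [m["role"] for m in few_shot_messages[1:]]
print(roles)  # ['user', 'assistant', 'user', 'assistant', 'user']
```

This list would be passed as the `messages` argument of the chat-completion call, exactly like the single-question messages earlier in the notebook.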
One thing I didn't mention so far is that this API is billed: the way OpenAI charges you for using the API is by the number of tokens, and you can think of a token, as I've explained before, as approximately three-fourths of a word. For every thousand tokens you're charged differently based on what model you're using. The tiktoken library here helps us understand how much we're using, but a much better way to do this is just going through your Manage account page and looking at how much you're being billed, because OpenAI provides you with all that data. What we have here is just a way to do it programmatically; you can copy-paste the code in and it helps you with that, and there's really no need to dig into exactly what it all means. Okay, so as you can see it's counted 129 prompt tokens, and you can do the math and see how much that costs. In general, the higher-end models like GPT-4 are slightly more expensive than GPT-3.5, but I would say the price is pretty minor; it's something like $0.03 or so per thousand tokens for one of the models. So that was it for using the API; hopefully I've given you a much better intuition of how this works. One last thing we can do is refine our prompt through the LLM itself. As you can see, all of this was just asking it a question, and what we can do to help with asking better questions is ask the model itself to make a better question. So we can say something like "Can you rephrase this question to be more concise and to the point?" and give it a question. "What is Python?" is pretty concise already, so maybe we could do something like "If I had three apples and my brother ate four, how many do we have in total?" and just run that. Yeah, so it makes it a lot more concise, and you can use this as your prompt instead of your original prompt. The reason this is helpful is that you cut down the token count, and so OpenAI charges you less when you're using more concise prompts. You can just say "reduce this" and hit submit; yeah, so this uses fewer tokens. So over time, if you keep calling the same prompt, you can just optimize it through ChatGPT.
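The "three-fourths of a word" rule of thumb above can be turned into a rough cost estimator. The heuristic and the roughly-$0.03-per-1K-tokens figure come from the transcript and are approximations; use the tiktoken library or OpenAI's pricing page for real numbers.

```python
# Rough billing estimate: ~4/3 tokens per word (i.e. one token is about
# three-fourths of a word), priced per thousand tokens.

def estimate_tokens(text: str) -> int:
    """Very rough token count using the 4/3-tokens-per-word heuristic."""
    words = len(text.split())
    return round(words * 4 / 3)

def estimate_cost(text: str, price_per_1k_tokens: float) -> float:
    """Approximate dollar cost of sending `text` at the given rate."""
    return estimate_tokens(text) / 1000 * price_per_1k_tokens

prompt = "If I had three apples and my brother ate four, how many do we have in total?"
tokens = estimate_tokens(prompt)
print(tokens, "tokens, about $", round(estimate_cost(prompt, 0.03), 6))
```

This is why shortening a prompt with the model itself, as shown above, directly reduces what you're billed: fewer words means fewer tokens.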
So that was it for prompting; now let's get into actually applying this in some projects. Our first project will be cloning the entire ChatGPT UI and assigning it a custom personality so that we can interact with custom characters. We're going to be using the Chainlit package in order to make our ChatGPT clone. Chainlit is basically a framework that allows us to build user interfaces really easily; if you've had experience with a package like Streamlit, you can imagine Chainlit as Streamlit but for large language model applications. It comes with a bunch of different features and unique integrations, and ideally this is what our end goal is going to be, as shown by this video. As you can see, it's a pretty feature-rich user interface, so now let's get started building this. Okay, so now let's actually get started with cloning this user interface. The first thing that we're going to want to do is import our Chainlit package, and the way we install it before we import it is to go to the terminal (ignore everything that's happening here; all we need is a Python environment) and run the command pip install chainlit. We're doing this in the terminal tab. Once you do that, it should install all the packages, and if you haven't already, you'll want to do pip install openai as well. This installs all the packages from an external source that you can use to work with Chainlit and all these features. Okay, so let's start: our first goal will be to create a user interface that just outputs everything that the user inputs. Let's start by defining a decorator, on_message, under which you define an async function called main; don't worry about async, that just means it's going to wait for the user to send a message instead of executing immediately. Basically this function main takes
in the parameter message, which is going to be mapped to a string. This message is going to be whatever the user inputs, and all we're going to do is say await cl.Message(...).send(). Right now it's just going to return an empty object, so what we have to do is actually specify what we want in the content. In the content, we can just put the message itself, because that's what the user inputs. So that's about it for our basic example. What we can now do to run this and actually see our user interface is go to your terminal again (you'll want to get really familiar with your terminal, because it will be really useful for the upcoming projects) and run chainlit run followed by your Python file's name, which is main.py here, so I'm going to put main.py and then use the -w flag. You should really remember that you're not supposed to run a Chainlit program with your editor's run button; I'll make this clearer after we run this command. Okay, so as you can see it's running on a localhost port. Just so we have complete clarity: we would normally use the run button to execute code, but this executes in a server, and Chainlit helps us deal with the server and all of the back end, so all we have to do is run that command. Now if we type in "hi," or whatever the user inputs, the chatbot echoes it back. For some of you running this for the first time, it might not actually look like this; you might have a bunch of text appearing there, like "Welcome to Chainlit," and the reason that's happening is the chainlit.md file. Here's just a sneak peek of what we're going to be covering in the future.
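Here is a minimal sketch of the echo bot described above, assuming the Chainlit version used in the video, where the on_message handler receives the user's text as a plain string (newer Chainlit versions pass a Message object instead, so check the docs for your version). The import is guarded so the pure helper stays usable even without Chainlit installed; to actually see the UI you'd run `chainlit run main.py -w` in the terminal.

```python
# Echo bot sketch for Chainlit; `format_reply` is a helper name I've
# introduced for clarity, not part of the Chainlit API.

def format_reply(user_message: str) -> str:
    """What the bot sends back for a given user message (here: an echo)."""
    return user_message

try:
    import chainlit as cl

    @cl.on_message                      # runs every time the user sends a message
    async def main(message: str):
        # send the user's own text straight back as the bot's reply
        await cl.Message(content=format_reply(message)).send()
except ImportError:
    pass  # chainlit not installed; `pip install chainlit` to run the UI
```

The `-w` flag makes the server watch the file for changes and reload automatically, which is why saving the file is enough to refresh the app.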
For now, go to this chainlit.md file. As you can see, mine is empty, but yours might be full of a bunch of text; none of the text there is important, so you can just Ctrl+A and delete all of it, or you can add something like "Welcome to this interface. Begin by sending a message," or something like that. Once you do this, hit Ctrl+S, which saves it, and as you can see, because of the -w flag that we put here, the server actually watches for changes: as soon as we save, it says "file modified: chainlit.md, reloading app," and it reloads. Okay, so as you can see it says "Welcome to this interface. Begin by sending a message," and once you send a message it goes away, but the app still just echoes everything we type. A couple more options in this user interface are hiding the chain of thought and expanding these messages, and, more broadly, you can toggle between dark mode and light mode; I'm going to stick to dark mode for this tutorial. Okay, so now we pretty much have the basic user interface; all we have to do now is pass the message into the API and then just .send() the answer, right? So let's do that now, and this is going to be pretty easy because we've already done this a bunch of times in the Colab notebook. What we're going to do here is make something called response, and in this response object we're going to put the ChatCompletion.create call, and in this we're going to put our model (we'll just leave that empty for now), our messages (again, empty for now), and maybe a temperature, so why not temperature equals something. You need to remember to put commas after all of them except the last one, and commas between the messages as well. The model can be anything you want, in our case gpt-4, and the messages is an array of dictionaries.
So in the dictionaries you'll need to pass two parameters: you'll need a role and you'll need the content, and I'll explain what that is in just a second. Again, the role is whatever it is, and then the content, and for temperature you can decide to put whatever you want between zero and one (or two). Once we're done with this, we can give it a try. Okay, so in our role key-value pair we can put "system" for the first message, and here we're just going to put it as "user." As you can see here, this first one is the system message, as you've seen in the ChatGPT playground, and this second one is basically you asking the question. In the content for the system message you can put "You are a helpful assistant," and in the user message we're actually going to pass in our message variable that's passed in by the user. Once we have this, instead of content equals whatever the user's message is, we're just going to return the response from ChatGPT. Okay, so now let's run this again, same command in the terminal. Okay, it's saying "no API key provided," so let me provide the API key: we can store it in a variable and put it in, and hit save. Then let's send something; I'm going to try it here, and it's throwing an error. Let's try to debug this: maybe what we can try doing is casting this response into a string; hit save and let's see if this works. Okay, perfect. So the mistake I made was that this was returning a whole JSON object, while the content parameter expects a string. As you can see in this object (let me just redo that), here, to my message "hi," it says the content is "Hello, how can I assist you today?" and obviously we don't want any of this other stuff around it. As I've shown you in the Colab tutorial, all we're going to do is
As I've shown you in the Colab tutorial, all we're going to do is index our way through this object to get our message: we go to choices, then inside choices we pick the zeroth element, then in the zeroth element we go to message, and then inside message we go to content. All of this is going to be wrapped in an f-string, so I'm just going to put it like so, remove this, and put this here. Okay, cool. Once we do this, let's try running it again. Okay, it works! You can say "tell me a short story" — yep, so as you can see, we've successfully cloned the ChatGPT user interface, and we can also call our API through it and modify any of our parameters here to suit. Now what we're going to do is something interesting, where we can sort of talk to a model that is assigned a specific personality. What I mean by that exactly is that we're going to change the system message to something that changes its personality. So here you can say something like "you are an assistant that is obsessed with Legolas". Once you do that, you can hit Ctrl+S — it will reload — and you can start by sending a message, so "hi", and once we wait a moment it returns the output. And yeah, it follows the system message and really pays attention to it, because as you can see here it's obsessed with Legolas. So this is how you can assign a personality to the model, and if you've ever browsed GitHub, there are a bunch of repositories where all they do is change the system message to make the model behave a certain way — that's exactly how you get this behavior without fine-tuning. So next, let's address some limitations of this approach.
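To make that indexing concrete, here's a sketch with a hand-written dictionary shaped like the JSON object the API returns (the values are illustrative stand-ins, not a live response):

```python
# A stand-in for the JSON object the chat API returns, so we can
# practice the indexing path: choices -> [0] -> message -> content.
response = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": "Hello! How can I assist you today?",
            },
            "finish_reason": "stop",
        }
    ],
}

# choices, then the zeroth element, then message, then content
answer = response["choices"][0]["message"]["content"]
print(f"{answer}")  # the f-string wrapping mentioned above
```

Everything else in the object (finish reason, usage, and so on) gets discarded; only the content string goes back to the user.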
We have successfully built a working ChatGPT clone, but there are a couple of limitations to it regarding the user experience. First, there's no streaming involved — streaming is the live, token-by-token output of the response, instead of processing the whole output at once and pasting it in directly. Second, there's no "generating the message" indicator — a button or message that helps the user see that the backend process is running; otherwise you just have to wait without any confirmation of whether anything is actually happening on the backend. And additionally, there's no backend context, by which I mean the user doesn't know what kind of LLM, or what chain, is actually running in the background. This third feature is pretty optional, but it's very useful for debugging once we get into more complex large language model systems. So let's see how we'd ideally want it to look — this is from the official Chainlit website. As you can see there is streaming here, these tokens are being emitted live, there is the background step that I talked about, and if you rewind the video you can see that the stop-task button allows us to actually see whether the LLM is still generating, and we're able to stop the flow. So how do we get there? The answer to that is LangChain. LangChain is the most popular Python library for working with LLMs; it has some pretty advanced functionality beyond what we just talked about, and we're going to be using it extensively in later parts of the course, when we get to web-browsing agents and using tools with agents. So anyway, let's get into our LangChain implementation. Great — now let's look into how to actually integrate LangChain so that we get access to all of those user-friendly features. This is what we're going to be working with now; I'm just going to stop the app
that we just ran, and we're going to pip install -U langchain — and this -U flag is very important. What the -U flag does is install the very latest version of LangChain, and the reason that's important is that LangChain is a library that gets updated pretty much every week or so, so anything that works today might be deprecated or discontinued next week; in order to keep all your dependencies in check, you'll want to keep updating LangChain. Next, I've defined a string here: this template just tells the LLM to think step by step, as it says here. And the first feature I'm going to show you from LangChain is the prompt template. If you've ever worked with the Python format function, this is pretty much the same thing. What I mean by that — and this is just vanilla Python — is that I can do template.format, and inside it pass question equals whatever you want, like "what is 1 + 1". Oh, my bad, we're supposed to print this — okay, let's print it. Yep: what it does is take this template variable and format the question into it. How is that actually happening? Well, the curly braces indicate that this is a placeholder that needs to be formatted, so everything inside the curly braces is going to be replaced by the question. That's exactly what PromptTemplate does, but it makes it a lot easier and a lot more LLM-friendly. So let's go right ahead with the LangChain and Chainlit implementation. We're going to define two decorators here — @cl.on_chat_start, and we'll do @cl.on_message as well — and inside the first one we're going to define our main function, and inside it we're going to do our prompt equals something, our llm_chain equals something, and then a cl.user_session.set call.
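The vanilla-Python version of what PromptTemplate does is just `str.format` on a template string — here's the exact experiment described above:

```python
# Vanilla Python string formatting -- the idea PromptTemplate builds on.
template = """Question: {question}

Answer: Let's think step by step."""

# Everything inside the curly braces gets replaced by the keyword argument.
filled = template.format(question="What is 1 + 1?")
print(filled)
```

PromptTemplate wraps this same substitution, but tracks the input variables explicitly so chains can validate and reuse the prompt.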
For that user-session set call, "llm_chain" is the key, and you set the llm_chain as the value. Okay, so now that we have that, let's actually go into what we're doing here. What on_chat_start does is run as soon as this Chainlit UI object is deployed — here are the variables that we need to initialize. So I'll just create the prompt here: PromptTemplate is the object, and we're going to pass it our template as template, and along with the template it takes input variables. In this case our template equals the template variable, and our input variable is the stuff inside the curly braces — so that's "question". Okay, that's it for our prompt, and now let's initialize our LLMChain. So what is an LLMChain? For now you can think of it as something that connects the prompt with our large language model. In this case let's just fill in a bunch of parameters: we're going to do prompt equals something, llm equals something, and verbose equals something. I'm going to go through what all of these mean — and the reason it's showing an error is that there need to be commas. Our prompt is basically just our prompt, which is the variable I defined here, and our llm is going to be the regular OpenAI model, but there's a different way to define it now: instead of going through the hassle of a bunch of different API calls, all you need to do is OpenAI(...), and we pass in our temperature here — and, my bad, streaming is supposed to be a parameter inside the LLM itself, so streaming you can just set to True, and this will stream, and we'll set the temperature to one. After this, verbose is basically the thought process — this will make more sense when we cover the agents tutorial, but you can think of verbose for now as just this extra
additional text that gets printed and helps you see the LLM's reasoning — the thought process that leads up to the answer. Then we take this llm_chain and store it in a user-session variable called "llm_chain", so that we can access it in the on-message callback. So in our on_message callback — we've done this one already — let's define another main function that takes the message string, and inside it we're going to retrieve the chain from our user session: llm_chain equals cl.user_session.get — so this time we did get instead of set — and in here we're going to pass "llm_chain", which is the key we set earlier. That basically just gets us this variable across the two callbacks. After we've done this, it's pretty simple: instead of calling our model itself, we'll now be calling our llm_chain. So we create a result variable and do await llm_chain.acall — the asynchronous call — and inside it we pass our message and then callbacks. Don't worry too much about what this callbacks thing is; it helps with streaming — from what I understand, it establishes a hook that gets called back as each token arrives. Afterward, we just return the output as we've always done: cl.Message, then we send it, and here the content is going to be result["text"]. As you can see, although this may not save a lot of lines of code — it might even be slightly more — we now have a much more organized framework to think about things, because once we adopt this LangChain framework, you can do things that are a lot more complicated within a similar number of lines of code. An example of this is what we're going to be covering next — you're obviously not expected to understand it right now, but it's written in roughly the same number of lines as what we have here, while being far more complex.
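The set/get pattern across the two callbacks can be sketched without Chainlit at all — below, a plain class stands in for `cl.user_session` (these names are toy stand-ins, not the Chainlit API, and the "chain" is a stub dictionary rather than a real LLMChain):

```python
# Toy stand-in for cl.user_session: a per-chat key/value store that
# on_chat_start writes to and on_message reads from.
class UserSession:
    def __init__(self):
        self._store = {}

    def set(self, key, value):
        self._store[key] = value

    def get(self, key):
        return self._store.get(key)

user_session = UserSession()

def on_chat_start():
    # Stand-in for building the real LLMChain (prompt + llm + verbose).
    llm_chain = {"prompt": "Question: {question}", "llm": "openai-stub"}
    user_session.set("llm_chain", llm_chain)

def on_message(message: str) -> str:
    # Retrieve the chain stored by the other callback, then "call" it.
    llm_chain = user_session.get("llm_chain")
    return f"chain={llm_chain['llm']} got: {message}"

on_chat_start()
print(on_message("hi"))  # → chain=openai-stub got: hi
```

The point is just that the two decorated functions share no variables directly — the session store is the bridge between chat start and every subsequent message.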
So once we're done with that, let's go back and fetch our API key. Okay, once we've put our API key in, we can actually test this out and look at all of our new features. So let's run it: chainlit run, then our file name ending in .py, and then the -w watch flag — and just wait for that to start. Okay, great. Now if we type "hi", as you can see we get everything that was in the demo that I showed you. There's a thought-process step here — this thought-process stuff is going to be more relevant once we look into other concepts — but this is basically the same thing. You can say "what is your name, how are you doing", and it's actually streaming here, and then it passes us the final output. Yeah — so everything that you did previously with your API directly you can now test out here: working with our system message, trying different user messages, different parameters like top-p, et cetera. And now let's get into something that's slightly more interesting: we're going to be able to use this LangChain framework to ask ChatGPT questions about our own PDF and text documents, which could be of any size — like 2,000 pages. So, more on that now. Now let's talk a little bit about vector databases and embeddings. Here are some examples that you might have heard of; we're going to be using Chroma DB, and some of the others, for this course. So what are vector databases? Vector databases are basically databases, or storage, specifically for embedding information, and they allow us to query and utilize this embedding information as quickly and efficiently as possible. So what's an embedding? Well, an embedding is a point in a multi-dimensional space where all the similar objects are grouped together — so in this case, in the digit-classification example, all
the symbols that represent a two are grouped together, all the symbols that represent a four are grouped together, and all the symbols that represent a nine are grouped together. So why is this important? Well, when you have the ability to group similar objects together based on a bunch of different parameters, you can build recommendation systems, you can build search engines — this is also used in LLMs themselves and in generative AI — and for this specific case, we're going to be using it for context-window expansion. If you remember from one of the previous slides in the introduction, I said that fine-tuning is not a very recommended way of enhancing a model's knowledge. Well, this is where vector databases come in, because while fine-tuning has its disadvantages, like catastrophic forgetting, vector databases simply retrieve the relevant context information for the language model so it can use it. What do I mean by this? Let's get a little more technical. Say I have this 2,000-page PDF and I have a couple of questions about it, and I want to be able to ask ChatGPT, or any other LLM, those questions. If you wanted to do this, I guess the naive way would be to just copy-paste every single letter in that book and paste it into the ChatGPT window, but obviously this won't work, because you'd be hit with the context limit — there's a maximum number of tokens that can be input into ChatGPT. The way to get around this context limit and work with new information is where vector databases come in. Basically, I'm just going to put this PDF into a text splitter, which will split it into text chunks of roughly equal length — you can imagine five-word sentences, or thousand-word passages — and once these chunks are made, they're put into something called an embeddings generator, and these embeddings are then stored in our vector database. So now, what's cool about this
is we can actually ask our question, and the vector database performs an operation called cosine similarity: it finds the nearest 10 (or however many you want) relevant sentences among these embeddings — the 10 closest sentences — and then it just outputs them. So that's what the embeddings and the vector database do. The vector database returns those 10 relevant sentences, and those 10 relevant sentences go into a question-answering model along with our question, so that it can be answered. The reason this is important is that we went from a 2,000-page book to 10 relevant sentences based on any query, and to me, 10 sentences is obviously a lot more manageable and useful to a model than a 2,000-page book. So why use vector databases? As I said before, cosine similarity is essentially the only function vector databases actually perform, along with storage — so why can't you just put all these embeddings into something like an array and run a cosine-similarity function over them? It's certainly possible, but the reason vector databases are so popular and being used so much is that they have clever algorithms that help retrieve all this relevant text at super-fast speeds, along with much more efficient memory usage. Additionally, it's also very convenient, because we can retrieve those 10 relevant sentences through one simple function call. So this is going to be the architecture for project two; as I've already mentioned, we're going to build this entire pipeline. One very important distinction I want to make here is the difference between our embedding generator and our QA model. If you remember from the digit-classification example, I told you the task was digit classification, and the model is able to group those similar symbols together — this is very similar, but here the QA model is doing that next-token, next-word prediction
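The similarity operation itself fits in a few lines of plain Python — cosine similarity between embedding vectors, used to rank chunks by closeness to the query. The vectors below are tiny hand-made stand-ins (real embeddings have hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their magnitudes."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-made 3-d "embeddings" for three chunks and one query.
chunks = {
    "chunk A": [1.0, 0.0, 0.2],
    "chunk B": [0.9, 0.1, 0.3],
    "chunk C": [0.0, 1.0, 0.0],
}
query = [1.0, 0.05, 0.25]

# Rank chunks by similarity to the query, highest first --
# the "nearest k sentences" step the vector database performs.
ranked = sorted(chunks, key=lambda name: cosine_similarity(query, chunks[name]),
                reverse=True)
print(ranked)  # → ['chunk A', 'chunk B', 'chunk C']
```

A vector database does exactly this ranking, just with smarter index structures so it doesn't have to compare the query against every stored vector.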
problem, and there's an important difference here, because the embedding-generator neural network and the QA model are very different. For the embedding generator in this case, we're going to be using something called text-embedding-ada-002, because it's a lot more efficient and it's sort of the standard for generating embeddings, whereas the QA model is going to be something like GPT-3 or GPT-4. In theory it is definitely possible to generate embeddings through GPT-3 and GPT-4, but ada tends to perform better and is a lot cheaper. So now let's get into the code and get some more hands-on experience with how vector databases work. Basically, what we're going to be using for this tutorial is Chroma DB, which is an open-source vector database that we can run locally on our machine and that is pretty scalable for production. The way we get started is we just do pip install chromadb, and it should install everything — but if you ever face any problems with installation (and I say this because I personally hit an error when I tried to install this package for the first time), the way you fix that is by running this command that I've commented out here; if you do that, the errors should go away. Once we do that, we can set our chroma_client equal to a Chroma DB client, so that we can start querying. Then we can create a collection with chroma_client.create_collection, and you can think of a collection as basically the place where we store our embeddings — so this is the actual vector database. So this collection will be our vector database (this is supposed to say chroma, by the way). And since the collection is our vector database, we can add information to it in the form of three variables: there is a documents variable, there is a metadatas variable, and then there's an ids variable. Our collection's add call takes all
three, so let's go through them one by one and understand them step by step. documents takes a list, metadatas takes a list, and ids takes a list. documents is the actual list of our documents — if you remember from the slide explanation that I gave you, all of our tokenized and split text goes into documents. For this example I can just say "my name is akshat", and another thing I could add is "my name is not akshat" — so let's just do that (and this is supposed to say documents, by the way, my bad). After we've done this, obviously you can put in as many documents as you want. And here we're going to put our metadatas. In this example they're not really going to be important — I'll explain why metadata is going to be very important when we're able to build our question-answering system — so for now let's just put a source for each one; it has to be one source entry per document. And each document has to have a unique ID, so id1, id2. The metadata will not affect any vector-database retrieval or computation; it just travels along with each document. Once we're done with that, we can look into why metadatas matter. So let's imagine that we've finished our PDF-retrieval project — that is, we have successfully been able to use ChatGPT to answer questions about our PDF. Even if the model achieves state-of-the-art performance, it's still prone to maybe a little bit of hallucination, and there's obviously a risk of that happening — and sources help combat that, because even if the model hallucinates, it will be forced to output the source from which it got its information. So even if it misinterpreted the information, the user can go to that source, find out exactly what the source says, and
sort of interpret it themselves, so that there is no misinformation being spread. So that's it for the collection, and now we can create a results variable, and in this results variable we can just query the collection: collection.query. The query call takes two things: it takes query_texts — we're just going to put a single text in for now, but you can pass multiple questions in here — and n_results, which is the number of results, obviously. query_texts is your question, so here this will be "what is my name". Once we do that, we can delete all this and just print results. Okay, so now let's run the code and see what it has to say. Okay, so it returns us this. What's happening is it takes this collection variable and queries it — meaning it takes the "what is my name" string and performs a cosine-similarity comparison between "what is my name" and "my name is akshat" — and it returns the distances, the metadatas, the embeddings (there are no embeddings in the output for now), and the documents. The embeddings are not being stored in the output at the moment, but we will be using ada-002 in the next example to do that; for now, imagine this is something that was just learned. The other thing we can do is set the number of results to two, and compare which document is more similar — closer to answering the question. As you can see here, "my name is akshat" has a distance of 0.83, while "my name is not akshat" has a distance of 0.93, and since the lower the distance, the closer the document is to answering the question, the first one is going to be favored: in a real pipeline, we'd be passing that sentence into our language model over this one. So that was it for the fundamentals of vector databases.
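To tie the add and query pieces together, here's a toy in-memory stand-in for the collection. It mirrors the shape of Chroma's API (documents/metadatas/ids on add, query_texts/n_results on query) but uses a crude word-overlap distance in place of real embeddings, so the numbers won't match Chroma's — it's only meant to show the mechanics:

```python
# Toy collection mimicking the Chroma add/query interface, with a crude
# word-overlap distance standing in for real embedding distances.
class ToyCollection:
    def __init__(self):
        self.documents, self.metadatas, self.ids = [], [], []

    def add(self, documents, metadatas, ids):
        self.documents += documents
        self.metadatas += metadatas
        self.ids += ids

    @staticmethod
    def _distance(a: str, b: str) -> float:
        # 1 - Jaccard similarity of word sets; lower = closer, like a distance.
        wa, wb = set(a.lower().split()), set(b.lower().split())
        return 1.0 - len(wa & wb) / len(wa | wb)

    def query(self, query_texts, n_results):
        question = query_texts[0]
        scored = sorted(
            zip(self.documents, self.metadatas, self.ids),
            key=lambda t: self._distance(question, t[0]),
        )[:n_results]
        return {
            "documents": [d for d, _, _ in scored],
            "metadatas": [m for _, m, _ in scored],
            "ids": [i for _, _, i in scored],
            "distances": [self._distance(question, d) for d, _, _ in scored],
        }

collection = ToyCollection()
collection.add(
    documents=["my name is akshat", "my name is not akshat"],
    metadatas=[{"source": "doc1"}, {"source": "doc2"}],
    ids=["id1", "id2"],
)
results = collection.query(query_texts=["what is my name"], n_results=2)
print(results["documents"])  # the lower-distance document comes first
```

Even with this crude distance, the document that actually answers the question ranks first, which is the behavior the real vector database gives you at scale.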
Now let's apply this, and whatever we've learned so far in this course, to something more exciting: we're going to build a PDF and text bot that can answer any of our questions about a given PDF or text file of any length. So now we have everything we need to get started with building our first document question-answering system, which is project number two. For the sake of simplicity, we're just going to be using PDF and text files as input to our project, but certainly everything that you learn here is going to be easily generalizable to any other file that contains text. Before we get started, let's import the necessary packages. If you skipped straight to this tutorial without watching the previous one, you'll want to install Chroma DB — pip install chromadb — and once you do that, you may need to run this command here; this is very important, but only run it if you're facing any errors with Chroma DB. So anyway, let's get started: we've just imported our packages, so let's look into what's happening. A quick recap on our architecture: the PDF gets passed into a text splitter, then an embedding generator, the embeddings get stored in the vector database, the vector database finds the closest set of 10 sentences to the query using cosine similarity, then those 10 sentences, along with the query, are wrapped into the QA model — a model like GPT-4 or GPT-3 — and it passes the answer out as our output. So let's look at what's happening there. Our text splitter, as I've said before, is this element here, and we get to choose what the chunk size is, which is the size of each sentence chunk — here I just put a thousand characters, but you can experiment with different values. Then we define our embeddings here: OpenAIEmbeddings by default uses the text-embedding-ada-002 model, and that generally provides the best performance along with the cheapest cost.
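A character-count splitter like the one just described can be sketched in a few lines — this is a simplified version of what LangChain's text splitters do, without their separator-aware logic, with an overlap so a sentence cut at a boundary still appears whole in one chunk:

```python
def split_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 100):
    """Split text into character chunks of at most chunk_size; consecutive
    chunks share chunk_overlap characters so boundary sentences survive."""
    if chunk_overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk size")
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

document = "word " * 600  # ~3000 characters of fake document text
chunks = split_text(document, chunk_size=1000, chunk_overlap=100)
print(len(chunks), len(chunks[0]))  # → 4 1000
```

Each of these chunks is what gets fed to the embedding model and stored in the vector database, one embedding per chunk.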
After that, this is just the welcome message for the user interface, and here we've defined two functions: a process_file function as well as a docsearch function. What I'm going to do is zoom out momentarily to give us a bird's-eye view of what's happening. Okay, so process_file uses this feature that Chainlit offers where the user can just upload a file. You can assume that most of this is boilerplate up to here, but after this point is where we actually call the text splitter: everything this function is doing is taking the embeddings, applying the text splitter to our PDF file, and then, once we have all the chunks, labeling them as sources — more on this later. Let's move on to our docsearch. docsearch is obviously where we retrieve our data from the embeddings: once we've split the documents into smaller chunks, we process the file here and set it in our user session. Don't worry about this too much; it just makes sure our docs are available to both the model and the client. You should be pretty familiar with the rest if you've gone over the previous segment: all we're doing is defining a retriever here, and Chroma DB takes all these embeddings, along with the embedding model, and just returns the relevant results for a query — so that's what this does. Okay, so now comes the actual Chainlit user-interface interaction part, and I've divided this whole section into two functions, so let's zoom in and see what each one is about. One second — so here we're defining an @cl.on_chat_start decorator, and basically what this does is send some boilerplate text; you can put whatever you want, like "Welcome to this space,
you can use it to chat with your PDFs." So once we send this message, we can move on to the more important lines here. Here, as you can see, it says while files equals None, ask for a file, and this is where you can change what kind of file you're actually accepting — after this tutorial I'd recommend you do some research and experiment with using CSV files, for example. Here we're just accepting text and PDF files with a maximum size of this much, but you can obviously allow more, and we just send that prompt out as a message. Basically, once the user has uploaded a file, we then say "processing" plus the name of the file, and then we basically just call these two functions from before and take their output up to here. So here we're getting a retrieval chain ready and setting it up — more on the retrieval chain now. This is a new kind of chain that we're learning in LangChain: before, we just did a basic LLMChain, but here we're doing the RetrievalQAWithSourcesChain. As I've mentioned before, chains basically unite the prompt and the model together, but now we can add another layer of abstraction to that definition, where chains combine prompts, LLMs, and other functions together. In this case it combines a retriever — which is our vector database of embeddings, via the docsearch-as-retriever function — with our model, which is the ChatGPT model. Once we're done with that, we can pass this out and save this chain into our user session, so that our backend can actually access it.
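Under the hood, the "stuff documents" step of a sources chain amounts to packing the retrieved chunks, each tagged with its source, plus the question into one prompt so the model can cite what it used. Here's a rough sketch of that packing — the prompt wording and the sample chunks are illustrative, not LangChain's actual template:

```python
def stuff_prompt(question: str, docs: list) -> str:
    """Pack retrieved chunks into one QA prompt, tagging each chunk with its
    source so the model can cite it. `docs` items: {"content", "source"}."""
    context = "\n\n".join(
        f'Content: {d["content"]}\nSource: {d["source"]}' for d in docs
    )
    return (
        "Answer the question using only the sources below, "
        "and cite the sources you used.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )

# Illustrative retrieved chunks (made-up content and source labels).
retrieved = [
    {"content": "Sapiens appeared in East Africa.", "source": "42-pl"},
    {"content": "The Cognitive Revolution began long before writing.", "source": "43-pl"},
]
prompt = stuff_prompt("Where did Sapiens appear?", retrieved)
print(prompt)
```

The chain then sends this single stuffed prompt to the QA model, which is why the retrieved chunks, not the whole book, are all the model ever sees.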
So now, this is what the backend looks like. Let's go through it step by step, because we've pretty much done all the work already — we've established everything, and this is the place where we actually call all our functions. As you can see, this is an @cl.on_message function, so whenever a message arrives, you fetch the chain — this retrieval chain — call it, make the answer presentable, and give it to the user. We could just stop here, but a problem with large language models currently is that sometimes — rarely — they do hallucinate and make up information when the context is lacking, and the way we can get around this is to provide citations to the user for what's actually happening. So even if the large language model hallucinates, a user who's interested in doing further research (which, for the sake of this example, he should be) can just click on the sources button and read the actual retriever output. That's what this section here does: it collects our sources and outputs them alongside the answer. All of this will make more sense once we run it. Oh wait — I don't remember defining any OpenAI key here, so we might need to do that. Yep, the OpenAI API key; let's just go ahead and get one. So, openai.com, log in, and once you're on the login page you can just hit API and then View API keys. I've created one already, but I'm just going to create a new one, which I will definitely disable as soon as this video ends. Create secret key, copy it, and we'll just use this. Hmm, that failed — okay, this should work, let's try doing this again. Okay, I think I know what's happening here: import os — generally, when you're using these packages, they expect you to provide the key through the environment rather than however I defined it, just so it's easier for the package to identify the key. So let's,
for the last time hopefully let's try this again yeah okay so it works we'll and this opens up here so welcome to the space you can use it to chat with your PDFs great so this is exactly what we've intended it wanted to do so now what we can do is we can just drop in a PDF file so what I did here was I just passed in you know this Sapient huge history of humankinds just a short extra profit a PDF but since uh this process is running locally you're pretty much unbounded by how much you data that you can input into this model so you'll see that it's processing here and what processing basically means is it's putting these text things into chunks is generating embeddings and passing the embeddings into the vector database so that we can you know perform all these steps here so let's just wait for that to happen one node is that at the time of recording this um I don't think chain that supports the drag and drop feature into files although it does claim to do that so your best bet is to just hit the browse button and then choose the file in the you know window that displays so anyway uh sapiens it's processed so we can you know ask it some questions so a question from this book would probably be like what is the in the Indian Theory and so as you can see here it shows me the retrieval q a chain with sources chain and you know it should be shorter it sort of shows me its hot process here which is exactly what this is so let's just go deeper into this Shin so the retrieval with sources come by is a combination of the stuff documents chain and the llms chain so that they can return our sources so let's just look into the actual output here and you know it so it answers the question obviously but what's more important here and what's more interesting is are the sword lists so I can say you know Source 42 it's not that great at formatting because you know the PDF is just a non-structured text and also a tokenization and chunking could sort of affect the alignment but this 
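About fixing that formatting: since the chunks are unstructured PDF text, a one-line whitespace normalization already helps a lot — for example:

```python
def clean_chunk(text: str) -> str:
    """Collapse the runs of spaces, newlines, and tabs that PDF extraction
    leaves behind into single spaces."""
    return " ".join(text.split())

raw = "The  Cognitive\n\nRevolution   began\t roughly  70,000 years ago."
print(clean_chunk(raw))
# → The Cognitive Revolution began roughly 70,000 years ago.
```

You could apply this either to the chunks before embedding them, or just to the source snippets before displaying them to the user.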
But this is pretty easily fixed — you can just remove the extra spaces. So again, I recommend you play with this and test out what you can do to improve it further; what I'd initially like you guys to work on is maybe removing these spaces between the text chunks, and to just keep asking it questions and see how it responds with uncertainty as well. For instance, I asked it a question and told it to respond in a poem with citations — I had no idea what kind of answer it would output, but yeah, it sort of answers the question, though as you can see it doesn't really respond in a poem, which is kind of a limitation, and you can experiment with how well and how flexible you can make these prompts. A quick fix for this would be using an agent that first retrieves these sources and this answer, and then passes the result back into the LLM so that it can make it a poem — you could experiment with that, although we are going to be covering agents shortly. So that's all for our Q&A with documents. Now we're getting into some of the more advanced things that we can do, and we're going to be looking specifically into the web-browsing and agent capabilities of LLMs. So why would you want an agent to browse the web? The reason is that there is a knowledge cutoff in the training data, and you can't train a model on an infinite amount of data, because that would be very expensive and time-consuming. Instead, you'd want to train a reasoning engine rather than a database — you'd want to train something that can just have access to all these resources and synthesize them itself, rather than having some internal knowledge base that can't be changed. Additionally, some of the information that you might train an LLM on might be biased, and might be susceptible to changes in the future that it hasn't accounted for
Browsing the web helps us get around this, because the model then always has access to the latest information. So now we're going to go into agents, which allow these LLMs to perform chain-of-thought reasoning. Here's an example of what I mean; this is from the Auto-GPT GitHub repository: "Welcome to this brief demonstration of Auto-GPT. Today we'll be showcasing the AI's learning ability by asking it to research information about itself. Let's begin. The program's first step is to use Google to find websites relevant to what it's researching. Having found the GitHub repository for Auto-GPT, it's opened up for analysis. After scanning the website's contents, it summarizes them and places them in the auto_gpt.txt file we have opened. As we can see, Auto-GPT can teach itself about different topics using the internet, allowing it to have a better understanding of the current world than ChatGPT."

So Auto-GPT is what we call an agent, and it works in this really cool way where it prompts itself. The user just gives it a prompt, and it can formulate a plan, input that plan back into itself, output an action, input that action back into itself, and arrange all these chains of thought into pretty sophisticated workflows that you can use to answer questions; and now, more recently, it can actually perform actions in the real world.

So the first problem we're going to try to solve in this basic course is "what is RLHF?". We want to ask the agent this question, and it should be able to use its tools to find the answer on the internet. The reason it doesn't know the answer is that reinforcement learning from human feedback, or RLHF, is a new concept that was only developed around 2022 to 2023, so ChatGPT, having its knowledge cutoff in 2021, would not be aware of it whatsoever. We're trying to empower it to research and learn about these things so that it can answer us.

So how are we going to do this? We're going to use LangChain's arXiv integration. arXiv is a platform that gives us access to over two million scholarly articles in various fields, from physics to economics to computer science, and it pretty much covers all the latest research and information that's available to us. Here's what the website looks like, just in case it isn't familiar.

Now let's go to the solution. We need a model that knows its resources and can utilize them, and furthermore it needs to be able to plan how to use those resources sequentially, step by step, prompt by prompt. When I say prompt by prompt, I mean it should be able to prompt itself to finish the task without the need for explicit programming, and without getting stuck in loops. Thankfully, all of this has been taken care of by LangChain already, as it has conveniently implemented a research paper called the ReAct framework that we can just use to turn any LLM into an agent. Here's roughly what such an algorithm looks like; you can pause the video and look at this in more detail.

But now let's start with a simple example and dive into the code. We're ready to actually implement project number three, which is going to be GPT Researcher. GPT Researcher basically answers our questions on pretty much the latest fields of science, and can provide us with citations if we ask it to. Again, if you've never heard of arXiv before, it's this website with a bunch of scholarly articles, everything from physics to math to economics, with pretty much all the latest research. So let's get started: here's the arXiv plugin in ChatGPT and the integration in LangChain, and I've made a notebook in the GitHub repository so that you can use it.
Let's skip ahead here... okay, so here is the internet browsing notebook. Let's zoom in a little so we can see the code better. We're just going to import our basic functions and define our OpenAI API key, so let me find it and paste it back in. Okay, once we've defined the API key, we can look through some of the basics. This should look super familiar to you; we've done this a bunch of times already, this llm definition. After that, we have this special parameter called tools. Basically, the tools parameter lets us store the tools that the LLM, the agent, can access based on its needs. One thing you can do, maybe after learning the other tools in this course, is separate them with commas and pass in some of the others as well; that's not exactly how you'd define every one of them, but you can integrate them into this toolbox.

Once we define the tools, we define a new type of chain, which is our agent chain. As I've said before, a chain basically unites our prompts with our large language models and our algorithms, and in this case the algorithm we're using is zero-shot-react-description. This is the algorithm that allows the LLM to think and prompt itself to reach the right conclusions. We're going to set max_iterations to 5 so that it doesn't get stuck in a loop and doesn't bill too much to your API usage. So we pass in our tools along with our llm, then the max iterations, and handle_parsing_errors, which you should set to true as well. After that, all we do is call agent_chain.run with whatever question we have, and observe the output.

So while that's running, let's look at the steps it takes. This algorithm is called zero-shot-react-description, and all we're doing is first asking it "what is RLHF?". If it doesn't already know, it identifies an action to perform, and the action here is arxiv, which is our API that searches the research-paper domain. It then inputs the term RLHF into the arXiv API, which returns a bunch of research papers as search results. Obviously that's not really presentable; I shouldn't have to look through all of this research to get my answer. So it observes this output under Observation, and then it has a Thought here: RLHF stands for reinforcement learning from human feedback. It takes in all of the research it just found and works out an answer for itself. That's how we're able to solve this knowledge-gap problem.

An additional thing you can do here is ask it anything you want, "what is a black hole?" for example, and it will give you the answer. Let's observe what the algorithm is doing: first it identifies that it should search for a scientific definition, then it identifies the correct action (this will make more sense when you give it more actions, so that it actually has to decide between them), then it inputs the term "black hole" into the action, and all this text here is whatever is returned by arXiv. It looks at all this returned text and then it thinks, meaning it takes in all this input and gives us an output based on what it concludes. So our final answer is: a black hole is a region of spacetime exhibiting gravitational acceleration so strong that nothing, no particles or even electromagnetic radiation, can escape from it. An interesting thing you can do here is query some of the other metadata that comes along with each research paper.
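Putting the notebook walkthrough above into one place, the agent setup looks roughly like this. I've pulled the settings into a plain dict for readability (that part is my own styling), and the run is gated on an API key so the sketch stays harmless without one; names like load_tools(["arxiv"]) follow the LangChain release used in the course.

```python
import os

# Settings from the walkthrough, kept as plain data so they're easy to tweak.
agent_settings = {
    "agent": "zero-shot-react-description",  # the ReAct-style self-prompting algorithm
    "max_iterations": 5,           # stop runaway thought loops (and API bills)
    "handle_parsing_errors": True, # recover when the model's output isn't parseable
    "verbose": True,               # print the Thought/Action/Observation trace
}

if os.environ.get("OPENAI_API_KEY"):
    from langchain.agents import initialize_agent, load_tools
    from langchain.chat_models import ChatOpenAI

    llm = ChatOpenAI(temperature=0)
    tools = load_tools(["arxiv"])  # the arXiv search integration (needs `pip install arxiv`)
    agent_chain = initialize_agent(tools, llm, **agent_settings)
    print(agent_chain.run("What is RLHF?"))
```

With verbose on, you'll see the same Thought → Action → Action Input → Observation loop described above printed step by step.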
Along with each paper you can see things like the title, who the authors were, and when it was published. So that's all for the arXiv tool. Let's move on to the next tool, but before that, let's look into the Chainlit integration, just to show you how easy it is to incorporate agents into a presentable UI. This is the Chainlit integration; most of it is just boilerplate, and all this code here we already ran in the previous Python notebook, so you can just copy-paste it. The one thing you need to remember is to store the agent in an agent variable, and then the answer is just agent.run. Everything else is boilerplate, and it lets you stream the final answer as well, which is pretty cool.

So let's now run our agent with Chainlit. Again, go to your terminal and type chainlit run with the internet browsing file. I apologize for the overly long file names, but they're numbered so you know which one contains what. Okay, so, as usual, same error: let's copy-paste our OpenAI API key back in, and this time let's make it a little more convenient. It should open up again, and now it should be working. So whatever we just did through agent.run, we can do the same thing here: "what is RLHF?", and it uses arxiv, then it shows the title of the research it found, and then "I now know the final answer", and this is the final answer. Again, "what is a black hole?", and it answers the question for me, and we can look at the whole thought process.

One final note before we move on to another tool is this parameter here, verbose. Basically, when you set it to false, it removes all this thought-process output and just gives you the answer. That's useful in a production setting, but I think it's kind of fun to look at the agent thinking and how it's able to come to its conclusions, so that you can verify whether or not its reasoning holds up.

So another tool we're going to use is human-as-a-tool. This is really cool, because say your agent goes off on the wrong track: what you'd want to do is validate it and tell it what it's doing wrong, so that it can fix its thought process and move in the right direction. Instead of having to run it once and then change the parameters, you can just talk to it and make it work, and this is what the human-as-a-tool package solves. This is just a simple scenario here; you can look through the code, but the only difference is that in the tools array I've added the human tool, and I've also given it a math tool so that it can do the math. So let's run human_as_a_tool.py... and I forgot the OpenAI API key again; I'll remember to add it to the other files. Let's paste that in and try running this again. Okay, cool, so it's working. It says "I need to figure out what the math problem is and solve it", and the reason why is that the question I wrote is "what is my math problem?". So all I do is type in a simple apple-counting word problem. This is a very basic example, but it observes my reply, uses the llm-math calculator tool, and shows me the answer. The reason this is cool is that you can use this tool, as I've mentioned before, in between its thought processes, so if it thinks of a wrong solution you can instantly give it feedback; that's why you'd add this human tool to your cluster of tools.
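A minimal human-as-a-tool setup, as a sketch: "human" and "llm-math" are the registry names LangChain's load_tools uses, and llm-math needs an LLM of its own, which is why we pass llm= into load_tools. The question text is illustrative, and the run only fires when an API key is set.

```python
import os

tool_names = ["human", "llm-math"]  # human input + the calculator tool

if os.environ.get("OPENAI_API_KEY"):
    from langchain.agents import initialize_agent, load_tools
    from langchain.chat_models import ChatOpenAI

    llm = ChatOpenAI(temperature=0)
    # "llm-math" is itself LLM-powered, so load_tools needs an llm here.
    tools = load_tools(tool_names, llm=llm)
    agent = initialize_agent(
        tools, llm, agent="zero-shot-react-description", verbose=True)
    # The agent will pause and ask *you* in the terminal what the problem is,
    # then hand your reply to the calculator tool.
    agent.run("What is my math problem? Solve it once you know.")
```

The same "human" tool can be dropped into any other agent's tools list, which is how you give feedback mid-thought-process.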
Another tool we can check out is the Python REPL tool, and I think you're noticing a trend here, where you don't really need a lot of code in order to deploy all these very complex agents, because LangChain has got this covered for us already; but if you want more customization with your tools, we're going to jump into that in a little bit. So let's start running this, and I'll explain what it does. You can think of it as an alternative to a Colab notebook. Basically, the LLM writes Python code, and the limitation with ChatGPT is that it can't run its own code, because it's not a computer. So this tool takes that code, puts it into a REPL, and retrieves the output of that code back to us, so that it effectively codes for us. Here my question was "what is the 10th Fibonacci number?", and here's a list of Fibonacci numbers. It identifies the action Python REPL, and the action input is the Fibonacci code; this is the code that goes into the REPL, the observation is as shown, and the thought is that it just prints out the Fibonacci numbers. So that was the Python REPL tool; obviously, all you have to do is combine this with the other tools I've mentioned before to make them work together.

Another cool tool we can explore is the YouTube search tool. It does what it says it does: it searches YouTube. To make it work, you actually need to install the youtube-search package, so let's install that. The cool thing about the YouTube search tool is this framework here: it's another way to define a tool, where we define an array of Tool objects. A Tool object is basically an object that takes in three parameters: a name, a function, and a description. The name can be anything; I don't think it matters that much. The function is what the language model actually runs when the tool is called. And the description is super important, because it tells the agent when to actually use the tool. In this case, when you're asking for YouTube video links, you'll want to call this tool, so our description says it's useful when you need to give links to YouTube videos. We also want it to put https and youtube.com in front of every link to complete it, because the tool only gives us the URL path, not the full YouTube link, and it's much more convenient to click on the links when you have that.

So again, initialize the agent with all the tools and everything, and I'm just going to ask it a simple question, which is "what's a Joe Rogan video on an interesting topic?". Let's try to run this... have I put my OpenAI key in? No. Okay, now we have our OpenAI API key, so let's run it again. As I mentioned before, it just returns the raw /watch path in the observation, and what I asked it to do was put https://youtube.com in front of every link so that it's actually clickable. So our final answer is this link; let's click on it and see what it gives us... yeah, it works, it returns the YouTube link to us. And I've shown you how to do this already, but all you have to do to integrate this into Chainlit is copy-paste all of this and replace the arXiv stuff; that's all, it's super simple.

Let's go through more of the integrations, so I can show you what you can do to research and explore more tools. There's the Zapier Natural Language Actions API tool, which I recommend you guys explore on your own; I'm not going to demo this one, because it's not the safest for me to show my API keys. Zapier is basically a service that gives you over 5,000 different integrations, and I'll be covering the Natural Language Actions API in a future video.
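Circling back to the YouTube tool for a second, the three-parameter Tool definition described above might be sketched like this. The complete_links helper and the tool name are my own illustrative choices; in the version used here, the raw tool returns bare /watch paths, which is why we prepend the domain. Assumes the youtube-search package is installed.

```python
import os

def complete_links(raw: str) -> str:
    """Prepend the YouTube domain to bare /watch paths (my own helper)."""
    return raw.replace("/watch", "https://www.youtube.com/watch")

if os.environ.get("OPENAI_API_KEY"):
    from langchain.agents import Tool, initialize_agent
    from langchain.chat_models import ChatOpenAI
    from langchain.tools import YouTubeSearchTool

    youtube = YouTubeSearchTool()
    tools = [
        Tool(
            name="YouTubeSearch",  # the name is fairly free-form
            func=lambda q: complete_links(youtube.run(q)),  # what actually runs
            # The description is what tells the agent *when* to use this tool:
            description="useful for when you need links to YouTube videos",
        )
    ]
    agent = initialize_agent(tools, ChatOpenAI(temperature=0),
                             agent="zero-shot-react-description", verbose=True)
    print(agent.run("What's a Joe Rogan video on an interesting topic?"))
```

Doing the link completion inside func means the agent's observation is already clickable, instead of relying on the prompt to fix it up.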
You can combine this with, say, the arXiv research retrieval, by making a bot that every single week just returns the latest research on a specific topic and then tweets it out, or puts it in your email newsletter; Zapier has all those integrations handled.

The last tool we're going to end on is the shell tool, and once we're done with it, I'll show you guys how to make your own custom tools. So let's go to the shell tool; the code is in the CLI GPT Python file, and this one is sort of interesting, because you're doing a little bit of configuration here: you're taking the shell tool's description and appending its arguments to it, so that it's easier for the model to execute the local commands. What the shell tool does is basically what we do in this terminal, except ChatGPT can actually run commands in our terminal, create text files, and all of that. Here's a basic example; obviously we're just initializing the agent and adding the tools, which should be familiar by now. Once we do that, we can run the agent, and my question to the agent is: create a text file called empty and insert code that trains a basic CNN. As you can see, there's an empty.txt file from a previous test run of this, so I'll just delete that and run the code again... okay, let me fix the API key... and let's run this again.

Okay, so let's watch it execute the chain, and this is interesting because we can actually look at the exact commands it's passing through. This is a step-by-step, sequential reasoning problem: first it uses the terminal tool, and this is the command it passes in, touch empty.txt. Then, once the text file has been created, it puts all of this text into empty.txt, then comes its thought step, and then it just opens it, and here is all the code we were going for. So let's check... yeah, this actually worked: empty.txt is here, and it does have all the code that's required.

I'm just going to end this tool section by showing you guys, again, how easy it is to put all of these agent tools inside Chainlit. Everything I'm doing here is boilerplate; all I'm doing is copy-pasting the code I'd put in the CLI GPT file, putting that in here, and then calling agent.run and storing the result in this variable. So let's define the API key once more and run it with chainlit run; I've named this file something like agent output. Yeah, so we can basically do everything that we did previously in the Python notebook, but just in Chainlit. For example, I can tell it to install the numpy package in my environment. As you can see, it goes through: it knows that the action is to install numpy, so it just does that. I wanted to show this to you guys specifically because it creates a really cool use case, where you can sort of automate the process of learning how to code and speed it up. Say a beginner doesn't really know how to install packages or configure their environment: you could use this terminal tool and launch it as an application that helps them do that. I think GitHub Copilot could, or should, be using a tool like this to make it very easy to manage dependencies, and just make the LLM handle all the busywork so that you can focus on actually coding. So that's it for the CLI tool.
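The description-plus-arguments configuration for the shell tool can be sketched like so. The brace-escaping is needed because the agent's prompt template treats { and } as variables; escape_braces is my own small helper, and the rest mirrors the pattern from LangChain's ShellTool docs. Note that letting an LLM run shell commands is inherently risky, so only do this in a sandbox you trust.

```python
import os

def escape_braces(s: str) -> str:
    """Escape { and } so prompt templates don't read them as variables."""
    return s.replace("{", "{{").replace("}", "}}")

if os.environ.get("OPENAI_API_KEY"):
    from langchain.agents import initialize_agent
    from langchain.chat_models import ChatOpenAI
    from langchain.tools import ShellTool

    shell_tool = ShellTool()
    # Append the tool's argument schema to its description so the model
    # knows exactly how to call the terminal.
    shell_tool.description = shell_tool.description + escape_braces(
        f"args {shell_tool.args}")

    agent = initialize_agent(
        [shell_tool], ChatOpenAI(temperature=0),
        agent="zero-shot-react-description", verbose=True)
    agent.run("Create a text file called empty.txt and insert code "
              "that trains a basic CNN.")
```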
Now let's look into something that's pretty exciting, which is our custom tools. I'm just going to give you guys a super basic example of a custom tool, but know that what you can do with this is pretty much limitless. So let's stop running this code and look into custom tools. One important thing here is that you want to enable LangChain tracing, setting it to true, so that you can actually trace through your function calls. In this case, again: same imports, same llm initialization, and this is where you get customization. In your tools array, you can now define your own custom tool; in the Tool object, you pass in a name, a function, and a description. The name can be anything you want, and our function is parsing_multiplier. This is just a simple function that multiplies two things together, a and b. When you pass in the input "3 times 4", a general calculator would not be able to answer this, because (a) it's natural language, and (b) it requires a little bit of intelligence, since what you have is a string rather than plain numbers. So the multiplier function multiplies a and b, and parsing_multiplier takes in the string that the LLM passes to it, splits it on the comma, and extracts a and b so that it can multiply them. Let me just explain what that means: the LLM looks at these functions, and as its action input it would probably produce something like "3,4", as a string. This string "3,4" gets passed into parsing_multiplier, where it's split into a and b, so 3 goes to a and 4 goes to b; you convert them into integers, and then you return a times b, which here is 3 times 4.

This is just one custom tool; obviously you can do much more complex things. You can call APIs on the backend, you can hook it up to email, you can do a bunch of stuff with this. So let's run this chain now and see what it shows us... so it works. I'm not quite sure why it erred out at first, but it does show us the actual agent chain. Here it says "I need to multiply two numbers together", because that's what it's doing; the action is Multiplier, and the action input is "3,4". The observation is 12, because the action input "3,4" gets passed into parsing_multiplier, which is our function here; parsing_multiplier splits it into a and b and returns the multiplied output, so the observation is 12 because the output of our function is 12. Then it just says "I know the final answer": the final answer is 3 times 4 is 12. You could change this function to, say, dividing; obviously that's not the most creative change, but when you can define your own functions and let the agent use them, the possibilities are pretty much limitless. An interesting thing I just noticed is that I did define this as division, but the model is smarter than that, because I think it focuses more on the goal than on the actual means of getting to the goal. Even though the observation is 0.75, it prioritizes its own common sense and knowledge, maybe thanks to RLHF techniques, so the final answer it gives is still 12.
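Here's the custom multiplier tool from this segment end to end, as a sketch: parsing_multiplier implements exactly the comma-splitting logic just described, and the description text is my paraphrase of the idea that the description tells the agent when (and how) to call the tool.

```python
import os

def parsing_multiplier(pair: str) -> int:
    """The agent passes something like '3,4'; split on the comma and multiply."""
    a, b = pair.split(",")
    return int(a) * int(b)

if os.environ.get("OPENAI_API_KEY"):
    from langchain.agents import Tool, initialize_agent
    from langchain.chat_models import ChatOpenAI

    tools = [
        Tool(
            name="Multiplier",
            func=parsing_multiplier,
            # The description doubles as a mini input spec for the agent.
            description=("useful for multiplying two numbers; the input is "
                         "the two numbers separated by a comma, e.g. '3,4'"),
        )
    ]
    agent = initialize_agent(tools, ChatOpenAI(temperature=0),
                             agent="zero-shot-react-description", verbose=True)
    agent.run("What is 3 times 4?")
```

Swapping the body of parsing_multiplier for an API call, a database lookup, or an email send is all it takes to turn this pattern into any custom tool you want.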
So I'm just going to leave you guys with this: that was the end of custom tools, and the last segment of this course. I'll leave you with maybe one thing to think about, which is that the way these agents actually work right now, with zero-shot-react-description, is basically a pretty elaborate prompt combined with a recursive algorithm. So instead of instructing these behaviors through prompts, do you think there'd be a way to fine-tune the LLM based on the output of multiple agents? Since fine-tuning is meant for behavior changes, could you reinforce that logical-thinking behavior inside the LLM itself, so it wouldn't have to depend on outside ReAct frameworks in order to work?

If you've made it this far, then congratulations, because you just finished the course. Here's a list of everything that we covered. This course has been brought to you by loop.ai, which is software that allows you to create no-code AI flows in 10 minutes. You'll be able to perform all the actions I've covered in this course so far, and a lot more, with no code; within 10 minutes or less you can deploy and embed a custom AI flow on any website, and deploy and embed a custom chatbot on any website, from Twitter to Instagram to your own personal website to Shopify. I appreciate you guys taking the time to learn from this course. If you're excited about the product, head over to loop.ai and sign up for early access. Subscribe to our YouTube channel for highlights of the course, uploads per section, announcements, and updates on the product; the YouTube channel is also going to be the place where we post any of our future courses. Follow us on Twitter to be the first to be notified about all of our future events and courses, and if you have any doubts, queries, or concerns regarding any part of this course, join the Discord for assistance, where my team and I will be working every day to resolve them. Once again, I appreciate you guys taking the time to take this course. Thank you.
Info
Channel: freeCodeCamp.org
Views: 143,126
Id: xZDB1naRUlk
Length: 122min 54sec (7374 seconds)
Published: Fri Aug 18 2023