GPT 4: Superpower results with search

Captions
Today we're going to take a look at an example that shows how to alleviate two of the biggest problems with GPT-4, GPT-3.5, and other large language models. Those two problems are their ability to very convincingly make things up, which we call hallucination, and their inability to contain up-to-date information. Most of the models we're dealing with at the moment haven't seen any world data since September 2021 — that's where their training data cuts off — so they're pretty outdated.

What we're going to be able to do with the approach I'll show you is take a question like "How do I use the LLMChain in LangChain?" Now, LangChain is a very recent Python library, so most of these models, with that September 2021 training cutoff, have no idea about LangChain; LLMChain is a particular object within that library. If I ask GPT-4 how to do this, the answer isn't very good. The answer is that "LLMChain" in "LangChain" is an ambiguous term without context: it could refer to a language model — it did manage to get that, which is kind of cool — or it could be a blockchain technology. That's an answer I seem to see from GPT models quite a lot, that this is some sort of blockchain technology. So, assuming that "LLM chain" refers to a language model and "LangChain" refers to a blockchain technology, it gives you instructions on how to use it, and they're just completely false. This isn't useful to us in any way whatsoever.

With the approach I'm going to show you, we instead get this answer: to use the LLMChain in LangChain, follow these steps — import the necessary libraries, create and initialize your LLM, create a prompt template, import the LLMChain, initialize your LLMChain, and then run your LLMChain. That's exactly how you do it. So what we're going to cover in this video is how to make that happen. The question now is:
What are we going to do? As I mentioned, large language models kind of exist in a vacuum. They don't have any external stimuli from the world; they only have the internal memory that was built during training. That's all they have — and it's pretty powerful. You've seen ChatGPT, and now GPT-4: the things they can do are incredible. Their general knowledge of the world is very, very good. It's just not up to date, and it's not always reliable — sometimes they just make things up.

So what we want to do is give the large language model access to the outside world. How do we do that? We're going to use a few different components. The main component is what we call a vector database, and we're going to be using the Pinecone vector database for that. Essentially, you can think of it like this: within your brain, you have your long-term memory somewhere in there. You can think of Pinecone as your long-term memory storage, while the large language model is maybe like your short-term memory — or perhaps the neocortex, which runs your brain and performs all the logical calculations. That's roughly how we can think of these two components and how they relate to each other.

So let's say we take a query. Typically we would just put that query straight into the large language model. Instead, what we're going to do is use another large language model that has been built for embeddings. You can think of embeddings as the language of language models — that's kind of what these vectors are: they create a numerical representation of language. It's probably better if I draw that out. You have this embedding model,
and given your query, it's going to map it into what is essentially a vector space, placing it based on the meaning of that query. So we create this vector embedding, and then we take it to Pinecone. In Pinecone, we already have many of these vector embeddings that we created beforehand — all these different vectors everywhere, each representing a piece of information. What we're doing is putting our query vector in there and asking: which vectors are nearest to our query vector? Maybe it's this one, this one, and this one, and we return those. Those three items come out, and each of those vectors is connected to some piece of text relevant to our query. We then take our query and feed it into the large language model alongside the pieces of information we just retrieved. So now the large language model has some sort of connection to the outside world, in the form of this vector database, which retrieves relevant information based on a particular query. That's what we're going to implement. I think that's enough of the abstract visual — let's jump straight into the code.

I'll leave a link to this notebook so you can follow along; it will be somewhere near the top of the video. There are a few things we need to install. We're going to be using Beautiful Soup. You saw the question before — it's about a particular Python library. Where do we get information about that library? We just go to their docs at langchain.readthedocs.io, and they have everything we need there: guides, code, everything.
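Before we do, the retrieval flow just described — embed the query, score it against the stored vectors, return the nearest matches — can be sketched in plain Python. Everything here (the 3-dimensional vectors, the chunk IDs) is made up for illustration; the real setup uses 1536-dimensional embeddings stored in Pinecone:

```python
def dot(a, b):
    # Dot-product similarity between two equal-length vectors
    return sum(x * y for x, y in zip(a, b))

def top_k(query_vec, index, k=3):
    # index maps an id to (vector, text); return the k ids whose
    # stored vectors score highest against the query vector
    ranked = sorted(index, key=lambda i: dot(query_vec, index[i][0]), reverse=True)
    return ranked[:k]

# Toy "index": three stored chunks with tiny 3-d embeddings
index = {
    "chunk-a": ([0.9, 0.1, 0.0], "LLMChain docs"),
    "chunk-b": ([0.0, 1.0, 0.0], "Prompt templates"),
    "chunk-c": ([0.1, 0.0, 1.0], "Agents guide"),
}

print(top_k([1.0, 0.2, 0.0], index, k=2))  # ['chunk-a', 'chunk-b']
```

The nearest vectors are the ones whose text we hand to the LLM alongside the query.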
So all we're going to do is just scrape the website. Obviously, that docs site is pretty up to date for the library, so you could keep something running that goes through and updates it every half a day or every day, depending on how up to date you need this to be. We're going to be using Beautiful Soup, tiktoken, OpenAI, LangChain, and pinecone-client; I'll go through all of these later as we come to them.

Cool. I don't want to take too long going through how to get all the data, because obviously it will vary depending on what you're actually doing, but I'll show you very quickly. I'm using requests to get the different web pages. I'm identifying all the links that point to the same langchain.readthedocs.io site — getting all the links on each page that direct to another page on the site — and then getting the main content from each page. You can kind of see here the front page, "Welcome to LangChain": Getting Started, Modules, and so on. It's super messy, and I'm sure, one hundred percent, you can do better than what I'm doing here — this is really quick code, and most of it, even the pre-processing and data-scraping side of things, is mostly ChatGPT's work, not even mine. It's all just pulled together really quickly, and we get these pretty messy inputs. But large language models are really good at processing text, so I don't actually need anything more than this, which is pretty insane.

I'm just taking this and putting it into a function here, scrape, which takes a URL as a string, goes through, and extracts everything we need. Then I'm setting up a loop to go through all the pages we find, scrape everything, and add it all to data. You can see, if we scroll up, there are a few 404s where it can't find a web page.
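The video uses requests plus Beautiful Soup for the actual scraping; as a rough sketch of just the link-filtering idea — keep only links that point back to the same docs site — here is a stdlib-only version using `html.parser`. The HTML snippet is invented for the example:

```python
from html.parser import HTMLParser

class SiteLinkParser(HTMLParser):
    """Collect <a href> values that stay on the same docs site."""
    def __init__(self, prefix):
        super().__init__()
        self.prefix = prefix
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            # Keep only links that point back to the same site
            if href.startswith(self.prefix):
                self.links.append(href)

html = (
    '<a href="https://langchain.readthedocs.io/en/latest/modules/chains.html">Chains</a>'
    '<a href="https://example.com/elsewhere">Off-site</a>'
)
parser = SiteLinkParser("https://langchain.readthedocs.io")
parser.feed(html)
print(parser.links)  # only the on-site link survives
```

A crawler then visits each collected link in turn, repeating the same extraction until no new on-site pages are found.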
That might just be me calling a wrongly formatted URL, or something else — I'm not sure, but I'm not too worried; this is just a pretty quick run-through. All I want is a decent amount of data in here, and we have that. So let's have a look at what one of those records looks like. This is the third page we scraped, and yeah, it's really messy — it's hard to read, there's code and everything in here — but it's fine, it works, and we don't really need much more.

It is very long, though, and there are token limits with GPT-4. The model I'm using has an 8K token limit; there will be a new model with a 32K token limit. But we don't necessarily want to use the full token limit, because it's expensive — they charge you per token. So we don't want to just pass in a full page of text like this. It's better to chunk it into smaller pieces, which allows us to be more concise in the information we feed into GPT-4 later on, and also saves money — you don't want to just throw in everything you have. So what I'm going to do is split everything into chunks — not 1,000-token chunks; I'm actually running a little bit lower, so 500-token chunks.

Here I'm actually using LangChain — they have a really nice text splitter function. Let me walk you through it, because I think this is something most of us are going to need when working with text data. We want to take our big piece of text and split it into smaller chunks. How do we do that? Well, first, because we're using OpenAI models, we want the OpenAI tiktoken tokenizer to count the number of tokens in a chunk. That's what we're doing here: we're setting up this counting function, which will check the length of our text, and we're going to pass it into this function. So what is this
function? It's called the RecursiveCharacterTextSplitter. What it's going to do is first try to separate your text into roughly 500-token chunks using this character string — a double newline. If it can't find that, it's going to try a single newline; if it can't do that, it will try a space; and if it can't do that, the text just gets split wherever it can be. In my opinion, this is probably one of the better options for splitting your text into chunks. With this particular text it's probably not even that ideal — I don't even know if we have newlines in here, so it's probably mostly going to split on spaces — but it works, so we don't need to worry about it too much.

Cool. So we process our data into chunks using that approach. We're just going through all of our data, splitting everything, and getting the text records. If we look at the format of a record, we have the URL and the text — that's why we're pulling in this text field. Because we now have multiple chunks for each page, we need to create a separate record for each chunk, but we still want to include the URL. So what we do is create a unique ID for each chunk, and each record gets that chunk of text, the chunk number — each page will have around five, six, seven or so chunks — and the URL of the page, so we can link back to it at a later point if we want to.

All right, cool. Then we initialize our embedding model. We're using the OpenAI API directly, with the text-embedding-ada-002 embedding model. Now, embeddings are pretty cheap — I don't remember the exact pricing, but it's really hard to spend a lot of money embedding things with this model.
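The splitting behavior just described — try double newlines first, then single newlines, then spaces — can be sketched like this. This is a simplified stand-in for LangChain's RecursiveCharacterTextSplitter: word count stands in for the tiktoken token count, and unlike the real splitter it doesn't recurse into oversized pieces:

```python
def recursive_split(text, max_len, seps=("\n\n", "\n", " ")):
    # Word count as a cheap stand-in for a tiktoken token count
    def length(t):
        return len(t.split())

    if length(text) <= max_len:
        return [text]
    # Try separators in order of preference: paragraphs, then lines, then words
    for sep in seps:
        if sep not in text:
            continue
        chunks, current = [], ""
        for part in text.split(sep):
            candidate = current + sep + part if current else part
            if length(candidate) <= max_len:
                current = candidate  # still fits: keep accumulating
            else:
                if current:
                    chunks.append(current)
                current = part  # start a new chunk with this piece
        if current:
            chunks.append(current)
        return chunks
    return [text]

print(recursive_split("one two three four five six", max_len=3))
# ['one two three', 'four five six']
```

With no newlines in the scraped text, everything falls through to the space separator, exactly as noted above.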
So I wouldn't worry too much about the cost on this side of things — it's more when you get to GPT-4 later on that it starts to get a bit more expensive. Here's an example of how we create our embeddings: we call openai.Embedding.create, passing in the text-embedding-ada-002 model. You also need your OpenAI API key. For that, go to platform.openai.com — you'd come to the platform, go to your profile in the top right, and just click "View API keys". That's it. Then we run that, and we get a response with object, data, model, and usage fields. We want to go into data, and there we get our embeddings — this is embedding zero, this is embedding one, because we passed in two sentences. Each one of those has this dimensionality, and that's important for initializing our vector database — our vector index.

So let's move on to that. First, we need to initialize our connection to Pinecone. For this, you do need to sign up for an account, and you can get a free API key. To do that, go to app.pinecone.io. You'll probably end up in something like "your name's default project", and you just go to API Keys, press copy, and paste it in here. You also need the environment. The environment is not necessarily going to be what I have here — it's whatever is shown next to your key, and it can change depending on when you sign up, among other things. So don't rely on what I put here, which was us-west1-gcp. And if you already have a project that you set up with a particular environment, then of course it will be whichever environment you chose there.
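Backing up a step, the embedding call and response shape walked through above can be sketched as follows. The real call (commented out) needs an API key and uses the OpenAI SDK of the time; the parsing below runs against a stubbed response whose made-up 2-dimensional vectors stand in for the real 1536-dimensional ones:

```python
# The real call (requires an API key from platform.openai.com) would be roughly:
#   import openai
#   openai.api_key = "YOUR_API_KEY"
#   res = openai.Embedding.create(
#       input=["sample sentence one", "sample sentence two"],
#       engine="text-embedding-ada-002",
#   )
# Stubbed response with the same shape (object / data / model),
# with tiny 2-d vectors in place of the real 1536-d embeddings:
res = {
    "object": "list",
    "data": [
        {"object": "embedding", "index": 0, "embedding": [0.1, 0.2]},
        {"object": "embedding", "index": 1, "embedding": [0.3, 0.4]},
    ],
    "model": "text-embedding-ada-002",
}

# One embedding vector comes back per input string
embeds = [record["embedding"] for record in res["data"]]
print(len(embeds))  # 2
```

The length of one of these vectors (1536 for text-embedding-ada-002) is the dimension we hand to the vector index next.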
All right. After that, we check whether the index already exists. If this is your first time walking through this with me, it probably won't. The index name is "gpt-4-langchain-docs" — if I go into mine, you can see it will be there, because I created it just before recording this, so this step wouldn't run for me. The important things are: the index name, which you can rename to whatever you want — I'm just using this because it's descriptive and I won't forget what it is; the dimension, which is where we need that 1536 we got up above, the dimensionality of our vectors; and the metric, which is dot product. With text-embedding-ada-002 you should be able to use either dot product or cosine; we're just going with dot product here.

After this, we create our index, and then we connect to it. I'm using GRPCIndex here — you can also use plain Index, but this is generally more reliable, faster, and so on. Once you've connected, you can view your index stats. The first time you run this, you should see that the total vector count is zero, because it's empty. Then we move on to populating the index. To populate it, we're going to work in batches of 100: we'll create 100 embeddings and add all of those to Pinecone in a batch of 100.
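The create-if-missing pattern above can be sketched against a tiny stub standing in for the Pinecone client — the real calls are pinecone.list_indexes() and pinecone.create_index(...), after pinecone.init(api_key=..., environment=...):

```python
class StubPinecone:
    """Minimal stand-in for the pinecone module, for illustration only."""
    def __init__(self):
        self._indexes = {}

    def list_indexes(self):
        return list(self._indexes)

    def create_index(self, name, dimension, metric):
        self._indexes[name] = {"dimension": dimension, "metric": metric}

pinecone = StubPinecone()
index_name = "gpt-4-langchain-docs"

# Only create the index if it doesn't already exist; dimension 1536
# matches text-embedding-ada-002, and the metric is dot product
if index_name not in pinecone.list_indexes():
    pinecone.create_index(index_name, dimension=1536, metric="dotproduct")

print(pinecone.list_indexes())  # ['gpt-4-langchain-docs']
```

Running the same cell twice is then harmless: the second time, the existence check short-circuits the create.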
So what we're going to do is loop through our dataset — through all the chunks that we have — with this batch size, finding the end of each batch; the initial one will be zero to 100. We take our metadata information, get the IDs and the text from it, and then we create our embeddings. That should just work, but sometimes there are issues, like a rate limit error or something along those lines, so I've added a really simple try/except around it to just try again. Cool — after that we've got our embeddings, and we can move on.

Then we clean up our metadata. Within our metadata we only want the text, maybe the chunk number — I don't think we really even need the chunk, but I'm putting it in there — and the URL. I think the URL is important: if we're returning results to a user, it can be nice to direct them to where those results came from. It helps a user have trust in whatever you're producing, rather than not knowing where the information comes from. Then we add all of that to our vector index — the IDs, the embeddings, and the metadata for that batch of 100 items — and we just keep looping, batch of 100 after batch of 100.
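The batching loop just described can be sketched independently of Pinecone. Here `embed` and `upsert` are stand-in callables — a real run would call the OpenAI embedding endpoint and index.upsert respectively — and the record fields mirror the metadata from earlier (id, text, url):

```python
def batched_upsert(records, embed, upsert, batch_size=100):
    # Process the dataset in fixed-size batches: embed the batch's
    # texts, build (id, vector, metadata) tuples, then upsert them
    for start in range(0, len(records), batch_size):
        batch = records[start:start + batch_size]
        ids = [r["id"] for r in batch]
        embeds = embed([r["text"] for r in batch])
        meta = [{"text": r["text"], "url": r["url"]} for r in batch]
        upsert(list(zip(ids, embeds, meta)))

# Stand-ins: a fake embedder and a list that collects upserted batches
upserted = []
records = [{"id": str(i), "text": f"chunk {i}", "url": "https://example.com"}
           for i in range(7)]
batched_upsert(records, embed=lambda texts: [[0.0] for _ in texts],
               upsert=upserted.append, batch_size=3)
print([len(batch) for batch in upserted])  # [3, 3, 1]
```

With batch_size=100 over the real scraped data, this is the same zero-to-100, 100-to-200 slicing the video walks through; a retry wrapper around `embed` would handle the occasional rate-limit error.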
Right, once that's all done, we get to move on to what I think is the cool part. "How do I use the LLMChain in LangChain?" — I think we can just run this and have a look at the responses. It's kind of messy, but here we go: I'm returning five responses. The first one here — I don't think it's that relevant. Fine, on to the next one: this talks a little bit about large language models, but I don't think it really mentions LLMChain. Move on to the next one, and now we get something: it's talking about LLMChain, about combining chains and why we would use them; it mentions the prompt template, which is a part of the LLMChain; and it talks a little bit more about the LLMChain. That's the sort of information we want. But there's so much information here — do we really want to give all of this to a user? I don't think so. We want to give this information to a large language model, which will use it to produce a more concise and useful answer for the user.

To do that, we create this format for our query — we're just adding the information we retrieved above into the query. We can have a look at what the augmented query looks like. It's actually kind of messy; let me print it, maybe that's better. You can kind of see — these are just single lines, so it's really messy — but we separate each example with three dashes and a few newlines, and then we put our query at the end: "How do I use the LLMChain in LangChain?" That is our new augmented query. We have all this external information from the world, and then we have our query. Before, it was just the query on its own; now we have all this other information to feed into the model. Now, GPT-4, at least in its current state, is a chat model.
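The augmented-query format just described — retrieved chunks joined with dash separators, the original question appended at the end — can be sketched like this (the exact separator strings are an approximation of what's shown on screen):

```python
def build_augmented_query(contexts, query):
    # Join the retrieved chunks with "---" separators, then append
    # the user's original question after a final divider
    return "\n\n---\n\n".join(contexts) + "\n\n-----\n\n" + query

augmented_query = build_augmented_query(
    ["LLMChain combines a prompt template with an LLM...",
     "A prompt template defines the input format..."],
    "How do I use the LLMChain in LangChain?",
)
print(augmented_query)
```

The contexts here are invented placeholders; in the real pipeline they are the text fields from the top Pinecone matches.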
So we need to use the chat completions endpoint, like we would have done with gpt-3.5-turbo. With those, we have a system message that primes the model. I'm going to say: "You are a Q&A bot, a highly intelligent system that answers user questions based on the information provided by the user above each question." That part is important — "based on the information provided by the user above each question". Now, this information isn't actually provided by the user, but as far as our AI bot knows it is, because it comes in through a user prompt. Then: "If the information cannot be found in the information provided by the user, you truthfully say 'I don't know'." This is to try to avoid hallucination, where the model makes things up, because we don't want that. It doesn't fully fix the problem, but it does help a lot. So we pass in that primer, and then we pass in our augmented query.

We're also going to display the response nicely with Markdown. GPT-4 will format everything nicely for us, which is great, but if you just print it, it doesn't look that good, so we use this. Let's run it, and we get this: "To use the LLMChain in LangChain, follow these steps: import the necessary classes…" — I think these all look correct — initialize the LLM with temperature 0.9, and so on. All of this looks pretty good. The only thing missing is probably the fact that you need to add in your OpenAI API key, but otherwise this looks perfect. I mean, that's really cool.

Okay, that's great, but a question that at least I would have is: how does this compare to not feeding in all the extra information we got from the vector database? We can try it. Let's do the same thing again, but this time we're not using the augmented query — just the plain query — and we simply get "I don't know."
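The system-primer-plus-user-turn layout described above looks like this. The actual request — roughly openai.ChatCompletion.create(model="gpt-4", messages=messages) in the SDK of the time — is omitted so the snippet runs without an API key, and the retrieved context is a placeholder:

```python
primer = (
    "You are Q&A bot. A highly intelligent system that answers user "
    "questions based on the information provided by the user above each "
    "question. If the information cannot be found in the information "
    'provided by the user you truthfully say "I don\'t know".'
)

# The augmented query (retrieved context + question) goes in as the
# user turn; the primer goes in as the system message
augmented_query = "<retrieved context>\n\n-----\n\nHow do I use the LLMChain in LangChain?"
messages = [
    {"role": "system", "content": primer},
    {"role": "user", "content": augmented_query},
]
print([m["role"] for m in messages])  # ['system', 'user']
```

Swapping `augmented_query` for the plain query, with the same primer, is exactly the comparison run next — and it yields "I don't know".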
That's because we set the system up beforehand, with the system message, to answer only from the information passed in the user prompt and otherwise just say "I don't know". So that's good — it's working. But if we didn't have that "I don't know" part, would it just answer the question? Maybe we're limiting it here. So I've added this new system message: "You are a Q&A bot, a highly intelligent system that answers user questions" — nothing about saying "I don't know". Let's try. Okay: "LangChain hasn't provided any public documentation on LLMChain, nor is there a known technology called LLMChain in their library. To better assist you, could you provide more information or context about LLMChain and LangChain? Meanwhile, if you are referring to LangChain, a blockchain-based decentralized AI language model…" — I keep getting this answer from GPT, and I have no idea if it's actually a real thing or just completely made up; I assume the model must believe it, because it keeps telling me this. But obviously this is wrong; it isn't what we're going for. It also says, "if you're looking for help with a specific language chain or model in NLP" — that's kind of relevant, but it clearly doesn't know what we're talking about; it's just making guesses.

So that's just an example of where we would use this system, and as you saw, it's pretty easy to set up. There's nothing complicated going on here — we're just calling this API and that API, and all of a sudden we have an insanely powerful tool that we can use to build really cool things. It's getting stupidly easy to create these sorts of systems, and they're incredibly powerful — I think it shows; there are so many startups doing this sort of thing. But at least for me, what I find most interesting here is that I can take this and integrate it into some sort of tooling or process that is specific to
what I need to do, and it can just help me be more productive and do things faster. I think that's probably, at least for me right now, the most exciting bit. And of course, for anyone working in a company, or any founders working on their startup and so on, these sorts of technologies are like rocket fuel — the things you can do in such a short amount of time are insane. Anyway, I'm going to leave it there. I hope this video has been interesting and helpful. Thank you very much for watching, and I will see you again in the next one. Bye.
Info
Channel: James Briggs
Views: 25,210
Keywords: python, machine learning, artificial intelligence, natural language processing, nlp, Huggingface, semantic search, similarity search, vector similarity search, vector search, gpt4, gpt 4 python, gpt 4, gpt 4 openai, gpt 4 launch, gpt 4 chat, chatgpt 4, james briggs, llm, retrieval augmentation, openai api, llm gpt, gpt 4 code, gpt-4, chatcompletion, openai chat, gpt 4 hallucinations, gpt 3 hallucinations, openai gpt 4, gpt 4 access, generative ai, gpt 4 tutorial, gpt 4 test
Id: tBJ-CTKG2dM
Length: 27min 10sec (1630 seconds)
Published: Thu Mar 16 2023