ChatGPT 3.5 Knowledge Update: Retrieval-Augmented Generation (RAG) Explained

Captions
Okay, so here we are going to look at an example of RAG. What I want to show you is that a model, let's say ChatGPT (I'm using 3.5 in particular), has a cutoff date of January 2022. It says so clearly: its last update was in January 2022. I'm asking it a question: what did President Biden say about Chuck Schumer in his 2023 State of the Union? There is a comment about Chuck Schumer in the 2023 speech. Let's do it one more time: ChatGPT 3.5, the same question, what did Biden say about Chuck Schumer, and it cannot give me that answer.

Let's look at the State of the Union. This is the actual text of the speech, which I downloaded from the internet, and we can look at what he said about Chuck Schumer. Let me zoom in a little. It says, "Congratulations to Chuck Schumer, another, you know, another term as Senate Majority..." Senate Minority or Majority Leader; yes, he is joking around here. What we want is for ChatGPT to answer this question, but since ChatGPT doesn't know this, we will provide the text as an input to ChatGPT and solve the problem that way. That functionality is what is known as RAG.

I have an entire notebook created, and I'm going to run all of it. It installs some libraries, like LangChain, OpenAI, tiktoken, and whatever else it requires. From there I import certain functions. I need a retrieval component. I need ChatOpenAI as an object. I'm getting a text loader, because I need to do something with the text, the information we are giving to the model. Then I'm splitting that text, so I need a text splitter. I'm going to use an OpenAI embedding model so that I can convert the raw text into vector form; that is where the OpenAI embeddings apply. And there is another function called a vector store. A vector store is nothing but a vector database, similar to any SQL database you might be familiar with, such as Oracle or SQL Server, but it is mainly meant for storing vector information. Whatever embeddings we create will be stored in the vector store, or vector database. FAISS is the vector store I'm using here, and I'm using some additional functions alongside it.

After that I'm setting the API key, and I'm using the same model, GPT-3.5 Turbo. Remember, we are still using GPT 3.5 here; the Turbo model is what sits behind ChatGPT 3.5, so it's the same thing.

Next, I'm fetching the content. That content is nothing but state_of_the_union.txt, the same text we just saw. I simply give the file name, get the content, and store it locally as the State of the Union file, and then I load the data with my data loader. I'm printing the file just to show that this is the real file, and it does say "President Biden State of the Union" and so on: "Thank you. You can smile, it's okay." So this is exactly the text we got.

Then we split this text, because we cannot simply store it in the vector database as is. We need to chunk it, that is, break it into smaller pieces. In this scenario we are talking about a chunk size of 1,000 tokens. So we split the text and make sure the data is in chunks; we got about 70 different chunks. We convert that text into embeddings, and then we store them in the vector store.

After that I'm calling my model. It's ChatGPT, because we are calling the ChatOpenAI function, which is nothing but ChatGPT, and I'm adding what I would call a hyperparameter. Temperature is one of the hyperparameters, and for now I have a value of 0.7.

Finally I ask the question, "What did President Biden say about Chuck Schumer?", and I get the answer that President Biden acknowledged Chuck Schumer, referring to him as the Senate Minority (in brackets, Majority) Leader, and mentioned that Schumer has a slightly bigger majority this time around. If we go back to the text, here we go: "Chuck Schumer, another, you know, another term as Senate Majority, Minority Leader. You know, I think you, only this time, have a slightly bigger majority, Mr. Leader." Again, that is the raw text, and the model converted that raw text into a better output.

The point is that we are still using GPT 3.5 as the model. Sorry, I'm scrolling up and down, but I just want to show you: here we go, I'm still using GPT-3.5 Turbo, which has a knowledge cutoff date of January 2022, and still we were able to retrieve this answer. So this is the entire flow: retrieval, then augmentation, and finally generation. The point is that for any new or updated data we want to feed to the model, we can use a functionality like RAG. Okay, hopefully that was helpful. Thank you.
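
The flow described in the captions (split the text, embed the chunks, store them, retrieve the relevant chunk, and add it to the prompt) can be sketched roughly as follows. This is not the notebook from the video: it is a standard-library-only illustration that swaps the OpenAI embedding model for a toy word-count embedding and the vector database for a plain in-memory list, and it stops at building the augmented prompt instead of calling a chat model. Every name here (`split_text`, `embed`, `ToyVectorStore`, `build_prompt`) and the sample document, including its filler sentences, are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def split_text(text, chunk_size=120, overlap=20):
    """Split text into overlapping character windows (a stand-in for a text splitter)."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def embed(text):
    """Toy embedding: lowercase word counts (NOT a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """In-memory list of (chunk, vector) pairs; hypothetical stand-in for a vector database."""

    def __init__(self, chunks):
        self.entries = [(chunk, embed(chunk)) for chunk in chunks]

    def search(self, query, k=1):
        """Return the k chunks most similar to the query."""
        query_vec = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(query_vec, e[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question, context_chunks):
    """Augment the question with the retrieved chunks."""
    context = "\n---\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )


# Sample document: invented filler plus a paraphrase of the passage read out in the video.
document = (
    "Filler sentence on the economy and jobs. " * 5
    + "Congratulations to Chuck Schumer, another term as Senate Majority Leader, "
    + "this time with a slightly bigger majority. "
    + "Filler sentence on roads and bridges. " * 5
)

chunks = split_text(document)                      # 1. split
store = ToyVectorStore(chunks)                     # 2. embed and store
question = "What did President Biden say about Chuck Schumer?"
top = store.search(question, k=1)                  # 3. retrieve
prompt = build_prompt(question, top)               # 4. augment
print(prompt)                                      # 5. this prompt would go to the chat model
```

In a real pipeline the final prompt would be sent to the chat model, which is the generation step. The model itself is unchanged; it can answer only because the retrieved text sits in its input.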
Info
Channel: AI with Ritesh
Views: 90
Id: Kp1qItXoJxE
Length: 6min 50sec (410 seconds)
Published: Sun Jun 09 2024