End To End Advanced RAG App Using AWS Bedrock And Langchain

Video Statistics and Information

Captions
Hello all, my name is Krish Naik and welcome to my YouTube channel. So guys, in this particular video we are going to discuss an end-to-end LLM project with the help of AWS Bedrock and LangChain. This was also one of the most requested videos from many people out there. What we are going to do in this project is develop a document Q&A application where we'll be harnessing multiple models provided by AWS Bedrock, like Claude and Llama 2; you can also use Amazon Titan or whatever models Bedrock provides, and with them we can implement this entire application.

To begin, let me show you a quick demo. What is this application? It is a RAG system: we have multiple PDFs, all stored in the form of vector embeddings inside a vector store, and whenever we ask a query we harness the power of AWS Bedrock, which provides different models like Claude and Llama 2, to retrieve the answer from those PDF files. Let's say I ask "what are Transformers" and click on Claude output; that hits the Claude model through the API and I get the entire response. If you want Llama 2 output instead, you can click on Llama 2 and get that response. So whatever questions you have with respect to these PDF documents (I'll show you the set of PDFs as well), you'll be able to get the response here. Why is it taking some time? I still need to write the optimized version of the code, but after it runs the first time, it works absolutely fine. Now let me ask "what is YOLO". Earlier I showed you Llama 2 running in my local system, where it was taking a lot of time, but this response comes directly from the API, and I'm able to get the answers. This is what I'm going to develop completely from scratch.
If you remember, guys, in yesterday's session I uploaded a video where I showed you, through claude.py, how to invoke the model, and then we also saw the same with Llama 2. Now let me create a new file over here and name it app.py; inside this app.py I'm going to write my entire code. Before I go ahead, I need to do some installation for the requirements. As I said, we are going to use not only boto3 and the AWS CLI; we also need some more libraries. We require pypdf, along with pypdf we need langchain, then we will also be requiring streamlit, and along with this we require faiss-cpu, because from our local environment we're going to create the vector embeddings, and through FAISS we'll store them in the vector store. These are the basic requirements; the AWS CLI is there because we really need to configure it.

Now let me quickly open the terminal and clear the screen (yesterday I was creating the images over here). The first step, again, is to create a virtual environment and do the pip installs; I hope I don't have to walk through all of that, because I've repeated it many times, so let me just write pip install -r requirements.txt. If you are not able to follow, guys, I'll be providing the entire playlist in the description of this video; please follow the playlist, because I'm not going to repeat everything from scratch.

While the installation is taking place, let me talk about what we are going to develop. This is a document Q&A search: in short, my entire set of PDFs will be stored in a vector store, and from that vector store we can query any information we want. Along with this, we are going to harness the power of LangChain together with the LLM models from AWS Bedrock. So please prepare well and understand the architecture. Our project deals with two important steps, and before those there is another very important step called data ingestion. In data ingestion we read all the PDFs from a folder. Once I have the PDFs, the next step starts: I take all those documents, split them into chunks, and create the embeddings, and here we're going to use FAISS as the vector database; through it we'll create the vector store. Step by step I'll show you how to do this. Which embeddings are we going to use? For the embeddings, today I'll show you a new model called Amazon Titan, so that you understand you have multiple options.
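Based on the packages called out above, the requirements.txt would look roughly like this (a minimal sketch assembled from the list in the video; versions are left unpinned):

boto3
awscli
pypdf
langchain
streamlit
faiss-cpu
numpy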
If you go to companies tomorrow, they are definitely going to use Bedrock, because it provides a lot of features. For creating the embeddings I will specifically use Amazon Titan; if you don't want to use it, you can also use OpenAI embeddings, it is up to you, and I've also shown Google generative AI embeddings and multiple other embedding techniques in my playlist on Google Gemini.

After we create this vector store, how are we going to use the LLM? In the second step, whenever we ask a question, first a similarity search happens against the vector store; whatever relevant document chunks we get, we take those chunks and give them to the LLM model along with a prompt, for example "please summarize this information, based on the query I've asked, in 250 words". The LLM model, together with this prompt, takes that data and produces the answer. Both of these steps I'm going to develop completely from scratch, step by step.

So let's quickly start our coding without wasting any time. Here is my app.py. Since I'm doing it completely from scratch, there will be some installation, and some errors may come up, and we'll import all the libraries based on the pipeline and architecture we have already discussed. First of all, let's import json, os, and sys (I think I won't be using sys, but let it be; these are common libraries). Along with this we're going to use boto3, and again, guys, in my previous video I have shown you how to configure the AWS CLI; please make sure you watch the playlist, because if you jump directly into this video you won't be able to follow. In the description I'll give you the playlist; in the very first video I've shown how to configure the AWS CLI.

Next, as I said, we will be using the Titan embeddings model to create the vectors, i.e. to generate embeddings. The reason I'm showing this is that I've never used it in any of my videos. I'll call this Titan embedding through the LangChain framework; LangChain provides multiple options to interact with AWS Bedrock, and as I said, these frameworks are compulsory for you to know, LangChain and LlamaIndex. So I'm going to write from langchain.embeddings import BedrockEmbeddings (I'm verifying it from the documentation, which is not written very clearly, but I've already implemented this entire project). The next thing: from langchain.llms.bedrock I'm going to import Bedrock, since LangChain can also be used to call the LLM models present inside Bedrock; that integration exists as well.
So I'm going to use these two: BedrockEmbeddings from langchain.embeddings, and Bedrock from langchain.llms.bedrock; that covers the embedding part. Now there are also some libraries I need for data ingestion, because I really need to load the dataset. I'll write import numpy as np (I think numpy will be available by default anyway). Then from langchain.text_splitter I'm going to import RecursiveCharacterTextSplitter, because as soon as I load the documents I need to split them; this is what we specifically use, and in my LangChain series I've discussed everything about it. Then from langchain.document_loaders I'll import PyPDFDirectoryLoader; this is what we use for loading, and it is required for the data ingestion. I have already created one folder over here called data, which has two PDFs; I need to load all the PDFs from that folder and then perform the vector embeddings I require. So in data ingestion, first we load the documents, then we split them with the RecursiveCharacterTextSplitter, and after that we convert them into vector embeddings.

For the vector embeddings and the vector store I'm going to use FAISS (you can use ChromaDB too, it's up to you; I've shown both ways in my playlist). Note that some of these imports now live under langchain_community: BedrockEmbeddings and PyPDFDirectoryLoader are there, so you may see warnings for the old paths; FAISS I'll try from langchain.vectorstores, and if you get an error, check the documentation and use langchain_community.vectorstores instead. At the end of the day it's just the module path, no worries. I also considered langchain.indexes, but I won't use that; let's just use FAISS, since we've already installed faiss-cpu.

This was all for the vector embedding; now we need the LLM models, and LangChain already provides ways to load models from AWS Bedrock, plus we need to create our own prompt. So the last two imports: from langchain.prompts I import PromptTemplate, and from langchain.chains I import RetrievalQA, since I'm going to create a Q&A chatbot over the documents. Putting it all together, the import section looks like the sketch below.
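A sketch of the top of app.py with all those imports (exact module paths depend on your langchain version; the langchain_community paths shown here are the ones the current docs point to):

import json
import os
import sys
import boto3
import numpy as np
import streamlit as st  # used later for the app UI

# Titan embeddings and the Bedrock LLM wrapper exposed through LangChain
from langchain_community.embeddings import BedrockEmbeddings
from langchain_community.llms.bedrock import Bedrock

# Data ingestion: load PDFs and split them into chunks
from langchain_community.document_loaders import PyPDFDirectoryLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Vector store
from langchain_community.vectorstores import FAISS

# Prompting and the retrieval Q&A chain
from langchain.prompts import PromptTemplate
from langchain.chains import RetrievalQA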
Now let's call the Bedrock client so that we get access to all the models. I'll set up the Bedrock client over here: bedrock = boto3.client, with service_name="bedrock-runtime". I've already shown you yesterday, in my Bedrock playlist, how to call the client and access the models.

Next, as I already told you, the embedding class we're going to use is BedrockEmbeddings, so let me call it and show you how to get this embedding model from AWS Bedrock. I'll write bedrock_embeddings and give the model ID. Where do you get this model ID? Go into AWS itself and search for "embedding" under Foundation models / Base models; you can pick any of the embedding models, but I'm going to use the Titan embeddings model. If you click on it and scroll down, you'll see the entire details, like what the embedding is and what the model ID is. I've already copied the model ID from that page, and I'll paste it into my model_id parameter; you have to do the same step. Once that's done, the next thing we need to give is the client, and we have already created the Bedrock client above, so that will be my client parameter. Giving this Bedrock client means it knows we're calling this embedding model through Bedrock; in short, we are using AWS Bedrock.
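A sketch of that client and embeddings setup (the Titan model ID shown is an example of the kind the Bedrock console lists; copy yours from the console as described):

# Bedrock runtime client; credentials come from `aws configure`
bedrock = boto3.client(service_name="bedrock-runtime")

# Titan text embeddings served through the Bedrock client
bedrock_embeddings = BedrockEmbeddings(
    model_id="amazon.titan-embed-text-v1",  # example ID copied from the console
    client=bedrock,
)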
Now we'll go step by step: the client step is done, so let's implement the data ingestion, and please make sure you follow it. I'll create a function called data_ingestion. Inside it, what do we do? I have a folder, so I write loader = PyPDFDirectoryLoader("data"); from that data folder it picks up all the PDF files. That's the first step. Then I write documents = loader.load(), so the loader loads the entire set of documents. Next we use the RecursiveCharacterTextSplitter we already imported; it is very simple. I create a text_splitter where the first parameter is chunk_size, which I'll set to 10,000, and the second parameter is chunk_overlap, which I'll set to 1,000 (give the overlap a reasonably large value so neighbouring chunks share enough context to stay understandable). Once we have this text splitter, I write docs = text_splitter.split_documents(documents), passing the whole document set; in short, we split everything based on the recursive character text splitter, and then we return the docs.

So data ingestion is done: we read all the PDFs from the data folder, apply the recursive character text splitter with that chunk overlap, split all the documents, and return them. A sketch of the function follows below.
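A minimal sketch of data_ingestion as described (folder named data, 10,000-character chunks with 1,000 characters of overlap):

def data_ingestion():
    # Read every PDF inside the local "data" folder
    loader = PyPDFDirectoryLoader("data")
    documents = loader.load()

    # Split the pages into overlapping chunks before embedding
    text_splitter = RecursiveCharacterTextSplitter(
        chunk_size=10000,
        chunk_overlap=1000,
    )
    docs = text_splitter.split_documents(documents)
    return docs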
Now the next step is the vector embeddings and the vector store, and this is where we use the Titan embeddings we imported, together with FAISS. Let me write it down: I'll create one more function, get_vector_store, and inside it I first pass the documents, whatever documents come out of data_ingestion, because we take those documents and run the embedding over them. Next I write a variable, vectorstore_faiss, and use FAISS.from_documents: the first parameter is the docs, and the second parameter is bedrock_embeddings, the embedding we initialized above with the Bedrock client; that is what we use here. After we get this vector store, I save it to my local disk with vectorstore_faiss.save_local("faiss_index"); it will be saved in a folder right here on my hard disk (you can also save it in a database if you want). So the vector embedding is done and this step is completed; a sketch of get_vector_store follows below.
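Sketched out, that vector-store function looks like this:

def get_vector_store(docs):
    # Embed every chunk with Titan and build a FAISS index over them
    vectorstore_faiss = FAISS.from_documents(docs, bedrock_embeddings)
    # Persist the index locally so it can be reloaded without re-embedding
    vectorstore_faiss.save_local("faiss_index")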
Now let's talk about the next step: based on the retrieved output, we have to work with the LLM models. I'm going to create some LLM model functions. The first LLM I'll work with is Claude, so I'll write get_claude_llm, and inside it we create the Anthropic model. As you can see, AWS Bedrock gives you the power to harness multiple models, so we can use several of them inside this one application; it's like having different models available in different ways, but here I'll try to make it generic. In my previous tutorial the way of invoking a model was something like bedrock.invoke_model, but now, since we are already using LangChain, we don't have to call invoke_model ourselves; we just use the Bedrock class, and you can see where it lives, it is present in LangChain. They have created a wrapper that internally invokes the specific model; this is how you can use frameworks like LangChain in a generic way.

So I use this Bedrock class and give my model_id; I've already looked it up, and again, you get this information from the model examples page: go and click on any model (say I want Claude to generate code) and it shows the model ID, so it's a generic way of finding it and you don't have to worry about it. Then I give my client, which is nothing but the bedrock client, and after that the model arguments, model_kwargs. Where do you find these arguments? On the same example page, inside the request body, at the end there are arguments like max_tokens_to_sample and temperature that you can add here; for the model I'm using, those were the arguments present in that body, which is why I had kept all those JSON samples separately, so I can refer to them. I hope you're able to understand; if you are, please hit like. Whenever I call this function, my Claude model gets loaded, and I return the LLM.

Similarly, suppose you want to call the Llama 2 model: I'll paste the same code and name it get_llama2_llm; like this you can create as many functions as you want. Inside the Llama 2 version only the model ID changes, plus whatever arguments change; in this case the argument is max_gen_len. How am I getting that? Again, open a Llama 2 example (say the chain-of-thought one); you can see its model ID there, and inside the API request you can see the arguments, and this API request sample is pretty important. Then I return the LLM, and that's the model I get for Llama 2. So now you know how to call any of the models; sketches of both functions follow below.
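Sketches of the two model functions (the model IDs and model_kwargs values are examples of what the Bedrock console lists, not confirmed from the video; copy yours from the console as described):

def get_claude_llm():
    # Anthropic Claude through LangChain's Bedrock wrapper
    llm = Bedrock(
        model_id="anthropic.claude-v2",  # example ID; check the console
        client=bedrock,
        model_kwargs={"max_tokens_to_sample": 512},
    )
    return llm


def get_llama2_llm():
    # Meta Llama 2 chat model; note the different argument name
    llm = Bedrock(
        model_id="meta.llama2-70b-chat-v1",  # example ID; check the console
        client=bedrock,
        model_kwargs={"max_gen_len": 512},
    )
    return llm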
Data ingestion is done, the vector store is done, everything is done; now let's create my prompt template, and for this I'll use LangChain. In my prompt template I've written, in the Human/Assistant format: use the following pieces of context to provide a concise answer to the question at the end, and summarize with at least 250 words with detailed explanations; if you don't know the answer, just say that you don't know, don't try to make up an answer. The context and question placeholders are in there, and whatever the assistant outputs gets appended after the Assistant marker. With this we use LangChain's PromptTemplate, declaring context and question as the input variables.

Now that this is fine, let's focus on the response part. I'll write get_response_llm, where the first parameter is my llm, the second is my vector store, vectorstore_faiss, and the third is my query. We play with these three parameters: if I want a response from a specific LLM, I have to give these three pieces of information, the LLM model I'm calling, my FAISS index, and my query. Once we have them, we use the RetrievalQA chain we imported (again, I'm following the LangChain documentation). In RetrievalQA.from_chain_type I give my LLM model; the chain_type will be "stuff" (I've already shown you the different text summarization techniques, and stuff is one of them); then the most important thing, the retriever, which is where the similarity search happens, so we take the FAISS index (it holds the entire index) as a retriever doing similarity search with the argument k=3, the top three chunks; we also set return_source_documents; and then I pass my prompt through the chain_type_kwargs, the prompt created by the PromptTemplate above. After this we can get the response: I create a variable called answer and call the qa chain with {"query": query}, where query is the one coming in as input. When this QA chain runs (it is nothing but RetrievalQA), it retrieves the response and stores it in that variable, and inside it there is a key called "result" which gives you the output; when I first implemented this I printed the whole answer, and the "result" key holds the entire answer. So this is my get_response_llm, where I get the final result; a sketch follows below.
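The prompt template and response function, sketched per the description above (the prompt wording is reconstructed from the narration):

prompt_template = """

Human: Use the following pieces of context to provide a concise answer to the
question at the end, and summarize with at least 250 words with detailed
explanations. If you don't know the answer, just say that you don't know;
don't try to make up an answer.
<context>
{context}
</context>

Question: {question}

Assistant:"""

PROMPT = PromptTemplate(
    template=prompt_template, input_variables=["context", "question"]
)


def get_response_llm(llm, vectorstore_faiss, query):
    qa = RetrievalQA.from_chain_type(
        llm=llm,
        chain_type="stuff",  # stuff all retrieved chunks into one prompt
        retriever=vectorstore_faiss.as_retriever(
            search_type="similarity", search_kwargs={"k": 3}
        ),
        return_source_documents=True,
        chain_type_kwargs={"prompt": PROMPT},
    )
    answer = qa({"query": query})
    return answer["result"]  # the generated answer lives under the "result" key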
Now let's quickly create our Streamlit app. There are two important things with respect to it; one is that I have to make sure that whenever a document is updated, it gets converted into vector embeddings. So I'll create a main function, and first of all I import streamlit as st (it's already included in the imports sketch above); we're going to use Streamlit for the UI. See what I've written: the page is "Chat PDF", the header is "Chat with PDF using AWS Bedrock", and there's a text input for the user question, the kind of question I want to ask from the PDF files. Then a sidebar is created saying "Update or create vector store", and inside it I create a button, "Vectors Update". Once I click it, it calls data_ingestion, which reads all the files from the data folder, loads the documents, performs the recursive character text splitting, and returns the documents; after that we call get_vector_store, so we call them one by one, we get the docs, and inside get_vector_store everything is saved to a local folder on the hard disk in the form of a FAISS index. So as soon as I click the Vectors Update button, which sits in the sidebar (in Streamlit the sidebar is on the left-hand side), the vector store gets created and saved locally, and a folder named faiss_index will appear. That is step one.

The second step: I create another button called "Claude Output". This is important to understand: when I click Claude Output, that means I have to use the Claude model API, so the first thing is that my FAISS index should get loaded from local storage; that is why I'm writing FAISS.load_local with the faiss_index folder and the same bedrock_embeddings I used before. Then I call get_claude_llm (if you jump into the function with F12 you can see it calls that particular model and returns the LLM), so I get my llm, and that llm is used inside get_response_llm, where I give my llm, vectorstore_faiss, and query; with those three pieces of information I get my response and write it to the page. Very simple, very easy. This is my main function, so I guard it with the usual if __name__ == "__main__" check; a sketch of the entry point follows below.
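A sketch of that entry point (newer FAISS wrappers also want allow_dangerous_deserialization=True on load_local; the simpler call shown here matches the video-era API):

def main():
    st.set_page_config("Chat PDF")
    st.header("Chat with PDF using AWS Bedrock")

    user_question = st.text_input("Ask a question from the PDF files")

    with st.sidebar:
        st.title("Update or create vector store:")
        if st.button("Vectors Update"):
            with st.spinner("Processing..."):
                docs = data_ingestion()
                get_vector_store(docs)
                st.success("Done")

    if st.button("Claude Output"):
        with st.spinner("Processing..."):
            # Reload the saved index with the same Titan embeddings
            faiss_index = FAISS.load_local("faiss_index", bedrock_embeddings)
            llm = get_claude_llm()
            st.write(get_response_llm(llm, faiss_index, user_question))
            st.success("Done")


if __name__ == "__main__":
    main()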
Now let's execute this and see whether we get any errors. I'll clear my screen; in short, we've written everything, so I'll run streamlit run app.py. OK, it's working fine. On the left-hand side I get the "Update or create vector store" panel with the Vectors Update button, and here I have "Chat with PDF using AWS Bedrock". The first step, as I said: here is my data folder, and it has the PDF files attention.pdf and yolo.pdf, so I'm going to click on Vectors Update. What happens as soon as I click it? The data ingestion step runs: it reads both PDFs, converts them into vectors, and saves the vector store locally. Right now you cannot see any folder named faiss_index, but once I click Vectors Update, processing starts and finishes with a "Done" status, unless we hit errors; let's see. It is now loading all the PDFs, converting them into vectors, and storing them as a FAISS index; and if you look at the folder now, faiss_index has been created, containing index.faiss and index.pkl. That was the first step.

The second step is the other button, Claude Output. If I type a prompt, "what is attention is all you need", and click Claude Output, it calls the Claude model. Looking at the code again: first we load the faiss_index from local storage, then we call the Claude LLM, then we call get_response_llm with those three pieces of information including the user question, and finally I get the output. That's the Claude output. Similarly, if I type "what is YOLO" and click Claude Output, it follows the same pattern, loading the FAISS index and calling the Claude model. (How can performance be improved? I shouldn't reload the FAISS index on every click; I could load it once, outside the button handler. That will need a little care to avoid errors, so I'll leave it as an assignment; for now let's work with this.) Here you can see my YOLO answer is also coming.

Now, if I want to add Llama 2, what do I do? My get_llama2_llm function is already created, so I create another button, "Llama2 Output", copying the Claude branch and calling get_llama2_llm instead of get_claude_llm; the sketch follows below.
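That extra branch goes inside main(), alongside the Claude one, roughly like this:

    # inside main(), next to the Claude Output branch
    if st.button("Llama2 Output"):
        with st.spinner("Processing..."):
            faiss_index = FAISS.load_local("faiss_index", bedrock_embeddings)
            llm = get_llama2_llm()
            st.write(get_response_llm(llm, faiss_index, user_question))
            st.success("Done")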
So I save this quickly and click on Llama2 Output; let's see whether we get the answer. The same thing happens: first it loads the FAISS index, then it takes the LLM model, gives it all the information, and finally you get the answer. Similarly, I can type "what is attention is all you need", click Llama2 Output, and again I get the response. So everything here is happening with AWS Bedrock. Step by step I have shown you almost everything: the data ingestion step, which libraries get used, all the code written in front of you, and the Streamlit file we created. I'd suggest you go ahead and try it from your side; this is how you get the responses. The best thing is that all these models are available in AWS Bedrock, and it is already scalable, so you can use it according to your needs. So yes, this was it from my side. I hope you liked this particular video. I'll see you all in the next video. Have a great day, thank you, and take care. Bye-bye.
Info
Channel: Krish Naik
Views: 25,072
Keywords: yt:cc=on, aws bedrock tutorials, aws bedrock ussing lanhcian, rag using langchain, rag app using bedrock and rag, end to end rag app
Id: 0LE5XrxGvbo
Length: 37min 23sec (2243 seconds)
Published: Mon Feb 05 2024