Getting Started With Nvidia NIM - Building RAG Document Q&A With Nvidia NIM And Langchain

Captions
Hello guys, in this video I am going to show you some of the amazing and powerful features of NVIDIA NIM, which was recently announced by NVIDIA. NVIDIA NIM is the latest breakthrough in generative AI development: it is a set of inference microservices for deploying AI models, and it changes the way we deploy generative AI in the enterprise. NIM offers many kinds of models - LLMs, multimodal models, and NVIDIA AI Foundation models - and just with the help of APIs you can integrate them into your application, run them seamlessly, and scale them. In this video I am going to talk about NIM and also show you several coding examples, so please watch till the end, because as I always say, there will be many LLM models coming up, but the clear winner will be the company that provides the best inferencing for us.

Here is the landing page: "Instantly run and deploy generative AI. Explore the latest community-built AI models with APIs optimized and accelerated by NVIDIA, then deploy anywhere with NVIDIA NIM." You can experience leading open-source models and integrate them with a simple API call. The best part is that when you create an account you get 1,000 credits, which is more than enough to explore and call multiple models.

So let's get started. First, go to this page (I will put the link in the description of this video) and click "Try Now". Here you can see all the models: Llama 3 70B and almost all the other open-source foundation models, along with NVIDIA's own foundation models - Gemma, Edify for images, multimodal models, and so on. The models are grouped by use case: reasoning, visual design, retrieval, speech, biology, gaming, and more.

Let me show you one example first, and then we will create an end-to-end RAG application using NVIDIA NIM. Before I start any project, I need an API key. Let's say that in my RAG application I want to use Llama 3 70B Instruct, and the inferencing happens inside NVIDIA NIM itself. If you click on that model, you get a chat interface where you can try it out directly.
If I type "how are you" and send the message, I get a response immediately, and on the right you can see the code panel: whatever I type gets filled into the request content, so you can copy this exact code and call the model yourself, which is what I will show next.

Now, build.nvidia.com requires you to log in first. I am already logged in; if you don't have an account, please go ahead and create one. You can see I have 954 credits left - initially you get 1,000 when you create a new account. The full sample code is visible here, and it needs an API key. To generate one, click the green "Get API Key" button - this key authenticates your NVIDIA AI Foundation endpoint for test and evaluation - then generate the key and copy it, because we are going to use it in our code.

Let's open VS Code and build this step by step. First I create my environment: conda create -p venv python==3.10. We will need several packages in requirements.txt, and the RAG application will also use LangChain, which has an integration for NVIDIA NIM. For now, let me create requirements.txt with just two packages: openai, because the sample code uses the OpenAI client, and python-dotenv, because I will keep the API key in a .env file. I have copied the sample code into my notebook, created app.py, and pasted it there; these two packages are enough to run it.
Before installing, let me activate the environment with conda activate ./venv, and then run pip install -r requirements.txt. Once the installation is done I can run the code, because all it needs is the openai package.

Let's understand the code. We import OpenAI from openai and create a client whose base_url is https://integrate.api.nvidia.com/v1, exactly as given in the sample, along with the API key. Please make sure you don't leave the API key publicly visible in the script: since we have python-dotenv installed, we can keep it in a .env file instead. So let me create that environment variable first, because I will also need it for the end-to-end project - in .env I add NVIDIA_API_KEY with my key as the value, and I can load it wherever I want through python-dotenv.

Once the client is created, we call client.chat.completions.create. The model name is already provided by NVIDIA NIM in the sample - just imagine copying this and executing it directly, it's quite amazing. In messages, the role is "user" and the content is the question; instead of "how are you", let me write "provide me an article on machine learning". Then we set the temperature, top_p = 1, max_tokens = 1024, and stream = True, which means we will see the completion as a stream, just like in the OpenAI API. Finally, for each chunk in the completion we print the content.

Let me run it with python app.py - and there it is, the article on machine learning is streaming in and the entire output is coming through. You are able to execute it, and look how quick it is. Trust me, the inference with NVIDIA NIM is very, very fast, and that is what is going to bring a real breakthrough in generative AI development. At the end of the day, companies really need to think about inferencing, and NVIDIA is the king of GPUs, so the inferencing is obviously very good. A minimal sketch of this app.py is shown below.
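For reference, here is a minimal sketch of that app.py, assuming the OpenAI-compatible base URL and the meta/llama3-70b-instruct model id shown in the on-screen sample (your generated snippet and sampling values may differ slightly):

```python
# app.py - call a NIM-hosted model through the OpenAI-compatible endpoint
import os

from dotenv import load_dotenv  # pip install python-dotenv
from openai import OpenAI       # pip install openai

load_dotenv()  # .env next to this file should contain: NVIDIA_API_KEY=<your key>

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",  # NVIDIA's OpenAI-compatible endpoint
    api_key=os.getenv("NVIDIA_API_KEY"),
)

completion = client.chat.completions.create(
    model="meta/llama3-70b-instruct",
    messages=[{"role": "user", "content": "Provide me an article on machine learning"}],
    temperature=0.5,   # sample defaults; adjust as needed
    top_p=1,
    max_tokens=1024,
    stream=True,       # stream the tokens back chunk by chunk
)

# with stream=True the response arrives as chunks, so print them as they come in
for chunk in completion:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="")
```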
So that simple example is done. Now let's build an amazing end-to-end project - a RAG project - and I'll show you how to use NIM along with LangChain.

First I update requirements.txt. I add langchain-nvidia-ai-endpoints, which is the LangChain integration that lets you call all the models NVIDIA NIM exposes; langchain_community; faiss-cpu; streamlit, since I am going to build a Streamlit app; and pypdf. Then I go back to the terminal and run pip install -r requirements.txt again - it is going to take some time.

Once that is done, I create final_app.py, which is where all the code for this project will go. Whenever we build a RAG application we need some source documents, so I have four PDFs in a us_census folder. We will read all of those PDFs, build the RAG pipeline on top of them, and then ask questions about them. We are also going to perform embeddings, and for that we will use NVIDIA embeddings, which I will show you as well.

Now the imports: import streamlit as st and os; from langchain_nvidia_ai_endpoints import NVIDIAEmbeddings and ChatNVIDIA - the NVIDIA embeddings are again served through the API, and ChatNVIDIA lets you call any model available in NVIDIA NIM through the LangChain integration. Since I need to read from a PDF directory, I import the PDF directory loader from langchain_community.document_loaders, and I also bring in create_stuff_documents_chain, the retrieval chain helper, the chat prompt template, and RecursiveCharacterTextSplitter. By now the installation has finished too. Finally, from dotenv import load_dotenv and call load_dotenv() so that all my environment variables are available. The requirements additions and imports look roughly like the sketch below.
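A rough sketch of the updated requirements and the top of final_app.py; the exact import paths assume recent versions of langchain, langchain_community, and langchain-nvidia-ai-endpoints, so adjust them to whatever your installed versions expose:

```python
# requirements.txt additions for the RAG app (package names as mentioned in the video):
#   langchain-nvidia-ai-endpoints
#   langchain_community
#   faiss-cpu
#   streamlit
#   pypdf

import os

import streamlit as st
from dotenv import load_dotenv
from langchain_nvidia_ai_endpoints import NVIDIAEmbeddings, ChatNVIDIA
from langchain_community.document_loaders import PyPDFDirectoryLoader
from langchain_community.vectorstores import FAISS
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.chains import create_retrieval_chain
from langchain_core.prompts import ChatPromptTemplate

load_dotenv()  # pull NVIDIA_API_KEY from the local .env file
```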
Next I load the NVIDIA API key with os.environ["NVIDIA_API_KEY"] = os.getenv("NVIDIA_API_KEY") - this is exactly the variable I created in my .env file earlier. With the key in place, the next step is to call our LLM model, and this time I am going to use ChatNVIDIA with the same model I called before, Meta's Llama 3 70B Instruct. That becomes my llm.
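Continuing the same final_app.py sketch, the key loading and model setup might look like this (the model id is assumed to match the one used in the first example):

```python
# read the key created in .env and expose it to the LangChain NVIDIA integration
os.environ["NVIDIA_API_KEY"] = os.getenv("NVIDIA_API_KEY")

# the same Llama 3 70B Instruct model as before, now called through LangChain
llm = ChatNVIDIA(model="meta/llama3-70b-instruct")
```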
Now that the llm is ready, I need to read all the PDFs from the us_census folder, so let me create a function called vector_embedding, because I have to build vectors for all of these PDF files. Here I am going to use Streamlit session state so that I can access things from anywhere in the app. Inside the function, if "vectors" is not already in st.session_state, I first create st.session_state.embeddings and initialize it to NVIDIAEmbeddings - this is the embedding model we will use to convert the documents into vectors. Then I create st.session_state.loader using PyPDFDirectoryLoader and point it at the us_census folder. Let me quickly check that everything works so far: I open the terminal, run streamlit run final_app.py, and it comes up with a blank page and no errors, so everything is fine; let me close it.

This PyPDFDirectoryLoader is going to read all the PDF files inside us_census. After that I write st.session_state.docs = st.session_state.loader.load(), and loader.load() gives me all the documents. Next comes the text splitting, because I need to break the documents into chunks: st.session_state.text_splitter is a RecursiveCharacterTextSplitter with chunk_size 700 and chunk_overlap 50, and the final documents come from split_documents applied to the first 30 docs.

So, step by step: first we created our embeddings, then we read the entire directory of PDFs with the loader, loader.load() gave us the documents, and then we applied the recursive character text splitter so the documents get divided into chunks of size 700 with an overlap of 50, keeping the top 30 records. Finally, I need to convert those chunks into vectors, so I write st.session_state.vectors = FAISS.from_documents with the final documents and the same embeddings we created - in short, FAISS is our vector database. This whole function reads all the PDFs from the folder, splits them into chunks, converts them into vectors, and stores them in a vector database.

Now it's time for the UI. I write st.title("Nvidia NIM Demo"), and then I define my chat prompt template - since this is a RAG application, the prompt says: answer the question based on the provided context only, and please provide the most accurate response based on the question; here is my context, here is my question. After that I create a text input, "Enter your question from the documents", where you can ask anything about the documents. I also create a button: if the "Documents Embedding" button is clicked, I call my vector_embedding function and then display "FAISS Vector Store DB is ready using NVIDIA Embeddings". So as soon as I click that button, it runs the entire vector embedding pipeline and makes sure the vector store DB is ready. The ingestion function and the widgets look roughly like the sketch below.
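Continuing the sketch, the ingestion function and the Streamlit widgets could look roughly like this; the folder path, chunk sizes, prompt wording, and widget labels follow the video, while the variable names are my own:

```python
def vector_embedding():
    """Read the PDFs once, split them into chunks, and cache a FAISS index in session state."""
    if "vectors" not in st.session_state:
        st.session_state.embeddings = NVIDIAEmbeddings()
        st.session_state.loader = PyPDFDirectoryLoader("./us_census")   # folder with the four PDFs
        st.session_state.docs = st.session_state.loader.load()          # full document list
        st.session_state.text_splitter = RecursiveCharacterTextSplitter(
            chunk_size=700, chunk_overlap=50
        )
        st.session_state.final_documents = st.session_state.text_splitter.split_documents(
            st.session_state.docs[:30]                                   # top 30 documents only
        )
        st.session_state.vectors = FAISS.from_documents(
            st.session_state.final_documents, st.session_state.embeddings
        )

st.title("Nvidia NIM Demo")

prompt = ChatPromptTemplate.from_template(
    """
    Answer the question based on the provided context only.
    Please provide the most accurate response based on the question.
    <context>
    {context}
    </context>
    Question: {input}
    """
)

prompt1 = st.text_input("Enter your question from the documents")

if st.button("Documents Embedding"):
    vector_embedding()
    st.write("FAISS Vector Store DB is ready using NVIDIA Embeddings")
```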
Then, if a question has been typed into the input and I press enter, the next thing I do is create my document chain using create_stuff_documents_chain with the llm and the prompt - we have already seen this in our LangChain playlist. After that, I take st.session_state.vectors.as_retriever(): when we call as_retriever() on the vector store, it becomes an interface for retrieving data from the vector database, and since the vectors are stored in session state, we pull them from there. Taking that retriever, we then create our retrieval chain with create_retrieval_chain(retriever, document_chain). Remember, the model being called underneath is served from NIM, so the inferencing happens on NVIDIA NIM.

Once I have the retrieval chain, I start a timer - let me import time, because I want to measure how fast this is - then I invoke the chain with the question, get the response, and write the answer to the page. Along with the answer, this setup also returns the context that was retrieved, so below the answer I use a Streamlit expander called "Document Similarity Search" to display that context. A sketch of this question-answering part follows below.
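And the question-answering part of the sketch, assuming the response dictionary returned by create_retrieval_chain exposes "answer" and "context" keys; the response-time line is my addition for the timing the video mentions:

```python
import time

# answering path: stuff the retrieved chunks into the prompt and time the NIM call
if prompt1:
    document_chain = create_stuff_documents_chain(llm, prompt)
    retriever = st.session_state.vectors.as_retriever()
    retrieval_chain = create_retrieval_chain(retriever, document_chain)

    start = time.process_time()
    response = retrieval_chain.invoke({"input": prompt1})
    st.write(f"Response time: {time.process_time() - start}")
    st.write(response["answer"])

    # show which chunks were retrieved for this answer
    with st.expander("Document Similarity Search"):
        for doc in response["context"]:
            st.write(doc.page_content)
            st.write("--------------------------------")
```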
Now let's run it and see whether everything works. I'll write python final_app.py - oh sorry, I have to run it with Streamlit, so streamlit run final_app.py. (I'll put the entire code in the description of this video, so go ahead, check it out, and definitely try NVIDIA NIM.) The first thing I do is click on "Documents Embedding": as soon as I click it, the app takes all the documents from the folder, converts them into vectors using NVIDIA embeddings, and stores them in the vector database, so we just wait until the vector DB is ready. It takes a little time - there are four or five files with a lot of documents, and everything is happening through NVIDIA inferencing - and then the message appears: the vector store DB is ready.

Next, I can ask any question I want. Let me ask: "What are the differences in the uninsured rate by state in 2022?" It needs to pick up the relevant information and give us an answer, and here it is: according to the context, the uninsured rates by state ranged from a low of 2.4% to a high of 16.6%, compared to the national rate of 8.0%. And with the help of the Document Similarity Search expander you can see exactly which context chunks were used to produce that result.

This entire thing is happening with NVIDIA NIM - from the NVIDIA embeddings to the open-source model, which here is Llama 3 itself. So go ahead and explore more models for your different use cases: reasoning, visual design, retrieval, speech, and so on - there are plenty of open-source and foundation models, so it is up to you to explore them. I feel NVIDIA has done amazing work with NIM, and it definitely solves a lot of problems. So yes, that was it from my side. I will see you all in the next video - have a great day - and all the information regarding this will be given in the description of this video. Thank you, I'll see you in the next video, bye bye.
Info
Channel: Krish Naik
Views: 12,059
Keywords: yt:cc=on, nvidia nim tutorials, rag document Q&A with Nvidia NIM, nvidia nim models, nivdia nim llama3 models
Id: 46FIOSYqruE
Length: 26min 29sec (1589 seconds)
Published: Tue Jun 04 2024