Mastering LlamaIndex: Create, Save & Load Indexes, Customize LLMs, Prompts & Embeddings | Code

Captions
Hey, hi everyone. In today's video we're going to cover LlamaIndex. It is one of the libraries you can use to connect a large language model to your own private, proprietary data. We have seen many videos on LangChain, and also on vector DBs like Pinecone and Chroma DB, but we never made a video on LlamaIndex, so I thought of creating an end-to-end introductory video on it. I have used it for a couple of projects, we are still exploring this framework, and I know one of the co-founders of LlamaIndex personally.

Here is what we're going to cover. First, how do we create an index with LlamaIndex and how do we query it: you take some PDF document, build an index over it, and ask questions against that document. Second, how do we save the index and load it somewhere else: say you build it here but want to use it on your server, an EC2 machine, so you need to know how to save your index and load it again. Third, how do we customize the LLM: by default LlamaIndex uses the OpenAI GPT-3 model, text-davinci-003, but what if you want to use GPT-3.5 or GPT-4? Fourth, how do we customize the prompt: the basic use case is getting answers from your document, but what if you want to give the model different instructions? And finally, how do we customize the embeddings, for example using an open-source Hugging Face embedding. These are things you always come across when you are building real projects, so we will go through each of these points.

Let's start with the document we're going to use. In earlier videos I used some sample documents; I think in the last video I used some page-related text file. This time I have a snippet of a book: The Almanack of Naval Ravikant. Naval Ravikant is one of the famous people on Twitter; he shares his philosophy about wealth, happiness, and things like that. I'm a big fan of his, and maybe the fact that I became a freelancer, or self-employed, was somehow influenced by him. The book is more than 200 pages, but I'm going to take only its first part, which is my favorite: the part on wealth, or building wealth, which is around 90 pages. So I took just the first 90 pages of the PDF, and we will ask some questions and see how it answers.

First, we install llama-index with pip. We also need pypdf, a dependency of LlamaIndex, because we want to chunk this PDF file. And since I'm going to show how to use a Hugging Face / sentence-transformers embedding later, we need sentence-transformers as well. Let's run the install. I have put a link to the installation and setup page here; it also mentions that LlamaIndex uses text-davinci-003, the GPT-3 model, by default. Since it is going to call an OpenAI model, we need the OpenAI key as part of our environment variables, or you can simply import openai and assign the API key, just like whenever we use OpenAI to make any call.

Next, let's create the index. The basic example is very simple. From llama_index we need two things. One is VectorStoreIndex, which is one kind of index you can create; in LangChain we saw essentially only this type of index, where you take your document, chunk it, store the chunks in some vector store, and then retrieve matching chunks and pass them to the LLM to generate an answer. The other is SimpleDirectoryReader: we give it a directory path and it reads all the documents inside, whatever kinds of documents we have. If you want to see exactly how the vector store index works, I put a documentation link here that explains how the different indexes work; it's their own documentation, with good visual representations.
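Putting those pieces together, the whole basic flow can be sketched roughly like this. Note this is a sketch, not verbatim notebook code: it assumes the llama-index and pypdf packages, a valid OpenAI key, and the 0.6/0.7-era import paths used at the time of this video (newer LlamaIndex releases have reorganized these imports); the "book" directory and the question are just this video's examples.

```python
# Setup (shell): pip install llama-index pypdf sentence-transformers
import os

from llama_index import SimpleDirectoryReader, VectorStoreIndex

os.environ["OPENAI_API_KEY"] = "sk-..."  # placeholder: your own OpenAI key

# Read every file in the "book" directory (here, the 90-page PDF snippet)
documents = SimpleDirectoryReader("book").load_data()

# Chunk the documents, embed each chunk with the default OpenAI
# embedding, and store everything in an in-memory vector store
index = VectorStoreIndex.from_documents(documents)

# Wrap the index in a query engine and ask questions against the book
query_engine = index.as_query_engine()
print(query_engine.query("What is this text about?"))
```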
Here is how the vector store index works, as their diagram shows. You can think of each node as one chunk: you have a big PDF and you might chunk it into, say, 4,000 or 2,000 characters each, so each node is one chunk. For each chunk you store a vector: you pass the chunk's text through an embedding model, either an OpenAI embedding or a sentence-transformers embedding, and store the resulting embedding. Then at query time your query goes through the same embedding model, so you get a query embedding, and you compare it against all the nodes in your vector store; say you select the top two chunks that best match the query embedding, since the stored chunks are also embeddings. Those top chunks then go into response synthesis, where the user query and the retrieved chunks are passed to an LLM like GPT-3 or GPT-3.5 Turbo to generate the answer. This is essentially what we have used with LangChain too, but the good thing is that LlamaIndex has a lot more to offer: there is a list index, the vector store index we just saw, a tree index, and more. Maybe in the next video I will cover all those different kinds of indexes and when to use each, but the purpose of this tutorial is to show you how LlamaIndex works, because the steps are almost the same no matter which index you use. So let's go back to our notebook.
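To make that retrieval step concrete, here is a tiny self-contained sketch of what the vector store does at query time, independent of LlamaIndex: every chunk is stored as an embedding vector, the query is embedded with the same model, and the closest chunks win. The three-dimensional vectors and chunk names are invented for illustration; real embeddings have hundreds or thousands of dimensions.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Each "node" is one chunk of the PDF, stored as an embedding vector
chunk_embeddings = {
    "wealth chunk":    [0.9, 0.1, 0.0],
    "happiness chunk": [0.1, 0.8, 0.1],
    "equity chunk":    [0.8, 0.2, 0.1],
}

# The query goes through the same embedding model
query_embedding = [1.0, 0.0, 0.0]

# Pick the top-2 chunks by similarity; in LlamaIndex these would be
# handed to the LLM together with the query during response synthesis
top_2 = sorted(
    chunk_embeddings,
    key=lambda name: cosine(chunk_embeddings[name], query_embedding),
    reverse=True,
)[:2]
print(top_2)  # ['wealth chunk', 'equity chunk']
```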
I should have run this earlier so the index would already be created by now. I'm giving the book directory here, and... okay, some issue with the OpenAI key; we did set it, I think it just takes a moment, let me try again. So we load with SimpleDirectoryReader, giving it our directory path, book; it loads the data and we get a list of documents, however many documents there are. Then we create the VectorStoreIndex from those documents, we get the index, and that index can be used as a query engine so we can ask it questions. Let's run it... we see the OpenAI key error again, but we do have the key, we ran that cell; let me rerun it and see if it works. Okay, now it seems to be running: it's at the stage where it's creating the index from the documents we just got from the directory reader. Once we have the index, we use it as a query engine so we can query it. That's it; we created the index.

Now let's ask some questions. First: "What is this text about?" Not searching for anything in particular, just an overall, meta kind of question. The response: this text is about the dangers of lusting after money and how it can occupy one's mind and lead to paranoia and fear; it also discusses how Naval Ravikant combined his vocation and avocation to make money in a way that felt like play. Okay, maybe; I didn't love the opening part, but anyway. Now: "Who is this text about?", because there will be a name in there. The answer: the text is about Naval Ravikant. The good thing, where I found LlamaIndex particularly better than the other options we have, is that it is very good at answering this kind of question: overall, meta questions about the document, where you're not searching for one particular thing. Now a "when" question: it says 2020, and I think I have seen that in the book, so it seems to be correct.

So within five or six lines of code you literally have a program where you can ask questions against a document. Here it is a book, but it could be anything: some legal document, a constitution, the Indian Penal Code, anything you want to query. This is how you create the index and then query it; pretty simple. Now one more overall question: list five important points from this book. It says: understand how wealth is created; find and build specific knowledge; play long-term games with long-term people; take on accountability; and build or buy equity in a business. This is pretty good, because this matches my own understanding of the book. Next: what does Naval say about wealth creation? Naval Ravikant says that wealth creation is possible through ethical means. He suggests seeking wealth instead of money or status, and says that one should own equity in a business in order to gain financial freedom. He also suggests
that one should give society what it wants but does not yet know how to get. Okay, this is a pretty good summary of what Naval actually says.

The next step is saving and loading the index. What if you want to take this index and deploy it on, say, an Amazon EC2 VM, or hand it to somebody else? You need to know how to save it. We take the index and call a method on its storage context, persist, giving it the folder name where we want to persist the index so we can load it later. Once it persists, we see a new folder, naval_index, containing the index store, vector store, graph store, all of that information. Now, how do we load it? We persisted it through a storage context, so from llama_index we import StorageContext, plus a function called load_index_from_storage. We first initialize the storage context with StorageContext.from_defaults, giving it the directory where we saved our index, and then pass that context to load_index_from_storage, which gives us the new index. We can use that new index exactly the same way as before: as a query engine, asking questions; and you can see it is able to answer that the text is about Naval Ravikant. So this is how you save and load an index.
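The save-and-reload flow just described can be sketched like this; again a sketch under the same assumptions (the naval_index folder name is just this video's example, and the imports follow the LlamaIndex version current at the time of the video).

```python
from llama_index import (
    SimpleDirectoryReader,
    StorageContext,
    VectorStoreIndex,
    load_index_from_storage,
)

# Build the index once, as in the previous section
documents = SimpleDirectoryReader("book").load_data()
index = VectorStoreIndex.from_documents(documents)

# Persist everything (index store, vector store, doc store, ...) to disk
index.storage_context.persist(persist_dir="naval_index")

# Later, e.g. on an EC2 machine: rebuild the storage context from that
# folder and load the index back instead of re-embedding the PDF
storage_context = StorageContext.from_defaults(persist_dir="naval_index")
new_index = load_index_from_storage(storage_context)

# The loaded index is used exactly like the original one
query_engine = new_index.as_query_engine()
print(query_engine.query("Who is this text about?"))
```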
Now, what about customizing the LLM? Say we don't want to use text-davinci-003, the default GPT-3 model; we want a different model. To do that we use something called the ServiceContext, which seems to be the way you pass configuration in LlamaIndex. We need two things from llama_index: the LLMPredictor, which wraps the LLM we're going to use, and the ServiceContext. If you notice, LlamaIndex doesn't have its own LLM model wrapper; under the hood it uses the LangChain models, and LangChain in turn uses some provider's model, like OpenAI's. From LangChain we're interested in the chat models because I want GPT-3.5 Turbo, so we import ChatOpenAI. The LLMPredictor takes a LangChain model instance as input, so we pass it the ChatOpenAI instance we just created, which takes the model name; that gives us the llm_predictor, and then we create the ServiceContext with that llm_predictor. Now, when you create the index from the documents, you pass that service context, which holds the configuration about which LLM to use. Let's create it: we're building a new index called custom_llm_index. Okay, the index got created, and we can ask this custom_llm_index the same questions the way we did before. I'm not going to ask many more questions; you can try it on your own documents. I just wanted to show you how it's done: import the LLMPredictor class, pass a LangChain model instance to it, and use that llm_predictor with the ServiceContext, which is passed in while creating the index from the documents.
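Those steps can be sketched as follows; this assumes the langchain package is installed alongside llama-index and uses the import paths from the versions current at the time of the video.

```python
from langchain.chat_models import ChatOpenAI
from llama_index import (
    LLMPredictor,
    ServiceContext,
    SimpleDirectoryReader,
    VectorStoreIndex,
)

# LlamaIndex has no LLM wrapper of its own here; the LLMPredictor
# takes a LangChain chat model instance, which in turn calls OpenAI
llm_predictor = LLMPredictor(
    llm=ChatOpenAI(temperature=0, model_name="gpt-3.5-turbo")
)

# The ServiceContext carries non-default configuration, e.g. which LLM to use
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor)

# Pass the service context while building the index
documents = SimpleDirectoryReader("book").load_data()
custom_llm_index = VectorStoreIndex.from_documents(
    documents, service_context=service_context
)
print(custom_llm_index.as_query_engine().query("Who is this text about?"))
```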
Now let's look at the custom prompt. What if you want to modify the prompt being used? Since it answers questions based on the document you provided, there must be some prompt inside, something generic like "based on this context, try to answer this query as truthfully as possible." Say we want to customize it with our own instruction; for example, I could give an explicit instruction saying that this whole text is about Naval Ravikant and what this book is about, so the model can answer even better. So from llama_index we import the Prompt module and create a template, similar to LangChain: it's basically an f-string kind of thing, a big string with placeholders, one for the context and one for the query. The extra thing I added in this prompt is that every answer should start with the code word "AI Demos." AI Demos is one of my side projects that I recently started sharing on LinkedIn as well; it's a website, and you can also check the YouTube channel, where we upload two video demos every day of recent AI tools, plus one or two Shorts. You can go and explore it, see what it's like to use a particular tool, and if you find one interesting, go check it out. Okay, back here: I want every answer to start with the code word AI Demos, so let me run this. We declare the template, and then on the custom_llm_index we had, we pass our question-answering template as the text QA template; that's the parameter name, text_qa_template.
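A sketch of the custom prompt step; the exact template wording here is my illustration, not the notebook's, and the Prompt import follows the LlamaIndex version used in the video.

```python
from llama_index import Prompt, SimpleDirectoryReader, VectorStoreIndex

# Template with the two placeholders LlamaIndex fills in at query time:
# {context_str} (the retrieved chunks) and {query_str} (the question).
# The last line is the custom instruction added in this video.
template = (
    "We have provided context information below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given this information, answer the question: {query_str}\n"
    "Start every answer with the code word 'AI Demos'.\n"
)
qa_template = Prompt(template)

documents = SimpleDirectoryReader("book").load_data()
index = VectorStoreIndex.from_documents(documents)

# The custom template is passed via the text_qa_template argument
query_engine = index.as_query_engine(text_qa_template=qa_template)
print(query_engine.query("What is this text about?"))
```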
Then we get the query engine and see how the response comes out. You can see that while answering, it starts with the word "AI Demos," because we instructed it to. So this is how you put your custom instruction into the prompt, your custom prompt, and use it.

After customizing the prompt, the next thing we want is our own custom embedding. By default it uses the OpenAI embedding, but say we now want to use an open-source embedding. Just as the custom model was a LangChain model instance, the custom embedding is a LangChain embedding instance, the Hugging Face embedding, which gives us an open-source embedding, and LlamaIndex has a wrapper for it: from llama_index we import LangchainEmbedding. And once again the ServiceContext: whenever you want something other than the defaults and need to pass configuration, that's where the ServiceContext comes in; I have put a link here where you can read more about it. So we import the Hugging Face embedding and the wrapper classes we need from llama_index, create an instance of LangchainEmbedding by passing it the HuggingFaceEmbeddings model instance, and then use this embedding by passing it to the ServiceContext. We create the ServiceContext object saying we want to use this particular embed model, and, just as before, we pass that context while creating the index. The index we get is then using this model, the sentence-transformers embedding. Let's create it.
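The custom embedding step can be sketched like this; HuggingFaceEmbeddings with no arguments downloads a default sentence-transformers model on first use (which model is the default depends on the LangChain version), and the imports again follow the library versions current at the time of the video.

```python
from langchain.embeddings.huggingface import HuggingFaceEmbeddings
from llama_index import (
    LangchainEmbedding,
    ServiceContext,
    SimpleDirectoryReader,
    VectorStoreIndex,
)

# Wrap a LangChain Hugging Face embedding (a sentence-transformers
# model, downloaded on first use) in LlamaIndex's adapter class
embed_model = LangchainEmbedding(HuggingFaceEmbeddings())

# As with the custom LLM, non-default config goes into the ServiceContext
service_context = ServiceContext.from_defaults(embed_model=embed_model)

# Chunks are now embedded with the open-source model instead of OpenAI's
documents = SimpleDirectoryReader("book").load_data()
hf_index = VectorStoreIndex.from_documents(
    documents, service_context=service_context
)
print(hf_index.as_query_engine().query(
    "List five important points from this book."
))
```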
Building the index... it started downloading the model we want to use, and now it's creating the index from our documents using this embedding model; it takes a few seconds... and it finishes. Let me clear the output. So we got a new index built with the Hugging Face embedding. Now let's use that new index and ask questions. You might be getting bored of me asking the same question again and again, so enough of that one; let's see how the question we saw earlier with the OpenAI embedding, "list five important points from this book," comes out. We get: building wealth and being happy are skills that can be learned; it is a collection of Naval's wisdom and experience from the last ten years; the book provides insight into Naval's principles for building wealth; and so on. If you look at these five points, they describe what the book contains, but they don't really summarize the key points for us. The other answer was better; where did we print it... here: understand how wealth is created, find and build specific knowledge, play long-term games, and so on. Those are really the key points, what Naval actually says, whereas these five points are more generic information about the book, like it being free to download; I don't think they're that good. What if I rerun it? Seems to be the same answer, so I would prefer the other one. Now the last question, what Naval says about wealth creation: it says wealth creation is not a one-time thing but a skill that needs to be learned; he suggests asking yourself whether what you are doing is authentic to you; and all of that.

Okay, so that's how you use it. I think in this tutorial we covered pretty much what you need if you want to build a project or a product on top of LlamaIndex: creating and querying the index, saving and loading it, and customizing the LLM, the prompt, and the embeddings. There is much more, though. For example, when we loaded the documents, what happened under the hood? How do I control the chunking? We saw the concept of a node in the diagram, but we never touched a node in this code. These are some of the underlying things we can learn later. As I told you, there are multiple index types, and there is a whole section about how the answer is synthesized; all interesting things, but this is pretty good to start with. Maybe I will make another one or two videos covering the other topics: the other indexes, response synthesis, and how to control the chunking. I hope you found this video useful; let me know if you have any doubts. I will share the code; it will be linked in the description of the video. And as I suggested, please go and visit AI Demos and subscribe; you will find some interesting tools that you can go and explore. Thank you.
Info
Channel: Pradip Nichite
Views: 10,374
Keywords: llamaindex, langchain, llm, openai, piencone
Id: XGBQ_f-Yy48
Length: 23min 56sec (1436 seconds)
Published: Wed Jul 05 2023