LlamaIndex overview & use cases | LangChain integration

Captions
LlamaIndex is a great large language model framework that helps you build applications by providing tools for document indexing, retrieval, and more. There are some functionalities I find incredibly useful: it lets you ingest from different data sources and data formats; it enables document operations such as inserting, deleting, updating, and refreshing the document index; it supports synthesis over multiple documents; it can use a router to pick between different query engines; it allows for various document embeddings, including hypothetical document embeddings, to enhance output quality; it supports the brand new OpenAI function calling API; and it offers a wide range of integrations with various vector stores, ChatGPT plugins, tracing tools, LangChain, and more.

Okay, so let's explore some of those functionalities. In this video I'm going to go through this notebook together with you. I will link the notebook in the description below, so if you'd like to follow along, please feel free to open it up and run it yourself.

First up, it is important to be able to load external documents in order to interact with your large language models. LlamaIndex provides data connectors on LlamaHub for us to do this easily. Here's an example where we import a WikipediaReader from LlamaHub, specify the Wikipedia pages we would like to read, and save those pages into a documents object. Just to show you what LlamaHub looks like: there are more than 100 different data loaders, and if you click on each one, it shows you exactly how to use it, with consistent syntax for whatever data you would like to load. For example, there's a Discord loader if you want to load Discord data, plus Google Docs, GitHub repos, and all kinds of other data sources. One thing I would like to point out is that it also supports multimodal documents. For example, you can use the image reader to load text from an image: if the image has plain text it uses one model, and if it has key-value-pair text, like an invoice, it uses a Donut transformer to extract the text from the invoice image. I thought this was pretty cool. So that's LlamaHub, for loading data consistently; a sketch of the Wikipedia example is below.

Before we jump into some of the cool use cases, I want to quickly go through the basic query functionality. In order to ask a question about your documents, you basically only need three lines of code: you build an index over the documents object (the Wikipedia pages we just loaded), you create the default query engine (the retriever query engine) from the index, and then you can ask a question about the documents and it will provide a response; see the second sketch below. This is the high-level API. You can also use the low-level API to configure how you would like to retrieve the information and how you would like to synthesize the response. There are different options for retrievers and for response synthesis that you can explore; I'm not going to go through each of them in this video, because I want to show you some of the cool use cases of LlamaIndex.
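Here's a minimal sketch of the WikipediaReader flow described above, assuming a 2023-era `llama_index` install with `download_loader` plus the `wikipedia` package; the page names are just examples:

```python
# Pull the WikipediaReader loader down from LlamaHub.
from llama_index import download_loader

WikipediaReader = download_loader("WikipediaReader")
loader = WikipediaReader()

# Load the Wikipedia pages you want into a list of Document objects.
documents = loader.load_data(pages=["Toronto", "Tokyo", "Berlin"])
```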
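And the three-line query flow over those documents might look like this, again sketched against the 0.6-era API where `VectorStoreIndex` is the default index type:

```python
from llama_index import VectorStoreIndex

# 1. Build an index over the documents.
index = VectorStoreIndex.from_documents(documents)

# 2. Get the default (retriever) query engine.
query_engine = index.as_query_engine()

# 3. Ask a question about the documents.
response = query_engine.query("What is the population of Tokyo?")
print(response)
```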
One of my favorite LlamaIndex features is document management, which allows for insert, delete, update, and refresh operations. Once you have created an index for your documents, you might need to periodically update it to allow for new data coming in, and this process can be costly: if you are using the OpenAI embedding option, document embeddings cost money. Since you have already spent that money on your previous documents, you probably don't want to spend it again. LlamaIndex allows you to update and refresh your document index without redoing the whole process all over again. In the notebook there is a complete example of this kind of use case, where we have two Discord data dumps from two timestamps. When the new information comes in, rather than rebuilding the entire index from scratch, we can index only the new documents using the refresh function. As you can see, it's just three lines of code to refresh your index; you don't need to re-embed the previous documents.

With LlamaIndex it's also super easy to query multiple documents. Here is an example where we have three PDFs of Uber financial data, for the quarters ending March 2022, June 2022, and September 2022. We first load the documents, then create an index for each document, then build query engines on top of the indexes; we have seen all of this before. What you have not seen is the QueryEngineTool, which allows us to define metadata for each query engine. Here we can give it a name, like "Sept 22", and provide a description, for example "Provides information about Uber quarterly financials ending September 2022". This is important because it basically tells the language model what each query engine is for, and based on the description of each query engine, the language model is able to use the correct document for each question. Then we use a special query engine called the sub question query engine, which allows us to query multiple documents: it generates a query plan containing sub-questions against sub-documents before synthesizing the final answer. So if we ask "Analyze Uber revenue growth over the latest two quarter filings", it generates two sub-questions for us: first, "What is the revenue growth of Uber for the quarter ending September 2022?", and second, "What was the revenue growth of Uber for the quarter ending June 2022?". The language model answers those two questions separately, and the final answer is based on the answers to those two questions.

Another super cool query engine is the router query engine. You can define a custom router query engine that routes to different databases. In this example we have a SQL database and a vector database, and based on your question, the router query engine routes it to the right one. We first create a SQL database using SQLAlchemy and add three rows of data with population information for three cities: Toronto, Tokyo, and Berlin. For the vector database, we use the Wikipedia pages for the same three cities. As always, we start by building indexes: we build the SQL index from the SQL database, and we build three vector indices from the three Wikipedia pages. Then, in the query engine tools, we give a description for the SQL query engine ("useful for translating a natural language query into a SQL query over a table containing city stats, with the population and country of each city"), and for each of the vector database query engines we give the description "useful for answering semantic questions about" that specific city. Finally, we define our router query engine, passing all of those tools into the query_engine_tools parameter. Sketches of the refresh flow, the sub question engine, and this router setup follow below.
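A minimal sketch of the refresh flow described above. The key idea is that each document needs a stable `doc_id` so LlamaIndex can tell which ones are new or changed; the id scheme here is just an assumption:

```python
# Give each document a stable id so refresh can detect new/updated docs.
for doc in new_documents:
    doc.doc_id = doc.extra_info["filename"]  # hypothetical stable id

# Inserts new docs and re-embeds changed ones; untouched docs are skipped,
# so you don't pay for their embeddings again.
refreshed = index.refresh(new_documents)
print(refreshed)  # one boolean per document: True if inserted or updated
```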
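A sketch of the multi-document setup, with QueryEngineTool metadata feeding the sub question query engine; engine variables like `june_engine` and `sept_engine` are assumed to be query engines already built over the individual PDFs:

```python
from llama_index.tools import QueryEngineTool, ToolMetadata
from llama_index.query_engine import SubQuestionQueryEngine

# The name/description metadata tells the LLM which filing each engine covers.
query_engine_tools = [
    QueryEngineTool(
        query_engine=june_engine,
        metadata=ToolMetadata(
            name="june_22",
            description="Provides information about Uber quarterly financials ending June 2022",
        ),
    ),
    QueryEngineTool(
        query_engine=sept_engine,
        metadata=ToolMetadata(
            name="sept_22",
            description="Provides information about Uber quarterly financials ending September 2022",
        ),
    ),
]

# The sub question engine plans sub-questions against the sub-documents,
# then synthesizes a final answer from their individual answers.
s_engine = SubQuestionQueryEngine.from_defaults(query_engine_tools=query_engine_tools)
response = s_engine.query("Analyze Uber revenue growth over the latest two quarter filings")
```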
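And a rough sketch of the router setup; `sql_query_engine` and the per-city vector engines are assumed to have been built as described above:

```python
from llama_index.query_engine import RouterQueryEngine
from llama_index.tools import QueryEngineTool

sql_tool = QueryEngineTool.from_defaults(
    query_engine=sql_query_engine,
    description=(
        "Useful for translating a natural language query into a SQL query "
        "over a table containing city stats, with the population and "
        "country of each city"
    ),
)
vector_tools = [
    QueryEngineTool.from_defaults(
        query_engine=engine,
        description=f"Useful for answering semantic questions about {city}",
    )
    for city, engine in [("Toronto", toronto_engine),
                         ("Tokyo", tokyo_engine),
                         ("Berlin", berlin_engine)]
]

# The router inspects the question and picks the best-matching tool.
router_query_engine = RouterQueryEngine.from_defaults(
    query_engine_tools=[sql_tool] + vector_tools,
)
```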
Now, if we ask "Which city has the highest population?", the router query engine routes the question to the SQL database, generates the SQL query, executes it, and returns the response that Tokyo has the highest population, with almost 14 million people. If we ask another question, "Tell me about the historical museums in Berlin", the router query engine directs it to the vector database, searches the Wikipedia pages, and returns the response you see here.

When we ask questions about external documents, what we normally do is create text embeddings for both our question and our documents, find the chunks in the documents most relevant to the question, and use those relevant text chunks to answer it. However, the answer to a question might not be as similar to the question as you might think. What if we could generate a hypothetical answer to the question first, and then use this hypothetical answer's vector to find the text chunks most similar to it? That's what hypothetical document embeddings (HyDE) is about. In this example we use the HyDE query transform to generate a hypothetical document and use it for the embedding lookup, and then we use the transform query engine to convert the original query into the hypothetical-answer query. In the example I got from the documentation, it improved output quality, but this might not work every time; you might want to use it with caution and check whether it actually helps your use case. There are other types of transform query engines you can check on this page. Query transformations are modules that convert one query into another; they can be a single step or a multi-step process. There are several use cases in the documentation that I thought were pretty cool to look at, and hypothetical document embeddings is just one of them, so I encourage you to take a look.

In this final section, I would like to talk about how to use LlamaIndex with LangChain. Please check out my previous 10-minute tutorial on LangChain; I have made several other videos on LangChain as well, and I like LangChain a lot. If you're familiar with LangChain, you might be wondering: what are the differences between LlamaIndex and LangChain, and why should I use one or the other? Great question. In my opinion, LangChain has a much broader use case, focusing on chains and agents and integrating with everything. LlamaIndex has a different, much narrower focus, going deep into indexing, retrieval, and query engine functionalities. But LlamaIndex and LangChain are not mutually exclusive; actually, I have seen a lot of applications using both LlamaIndex and LangChain together, so I would like to show you two ways you can combine them.

In the first example, we use LlamaIndex as a callable tool. It's a simple example: we load a document and create an index based on it, but instead of using LlamaIndex to query the document, we wrap the LlamaIndex query engine in a LangChain tool, and when we initialize the LangChain agent, we pass in the tools we just defined, that is, LlamaIndex as the tool. The second example is to use LlamaIndex as a memory module. LangChain has multiple ways to deal with memory, and LlamaIndex is the same; here we use GPTIndexChatMemory to keep the chat memory using LlamaIndex. When we initialize the agent, we set the memory to the memory object we got from LlamaIndex. Then, when you say "hi, I am Bob" and later ask "what's my name?", the language model is able to see the chat history and figure out that your name is Bob. Sketches of the HyDE transform, the LlamaIndex-as-tool setup, and the memory setup follow below.
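A sketch of the HyDE setup, with import paths following the 0.6-era LlamaIndex docs; the sample question is just an illustration:

```python
from llama_index.indices.query.query_transform import HyDEQueryTransform
from llama_index.query_engine.transform_query_engine import TransformQueryEngine

# Wrap the plain query engine so each query is first turned into a
# hypothetical answer, which is then used for the embedding lookup.
hyde = HyDEQueryTransform(include_original=True)
hyde_query_engine = TransformQueryEngine(index.as_query_engine(), query_transform=hyde)

response = hyde_query_engine.query("What did the author do after college?")
print(response)
```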
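A sketch of using LlamaIndex as a callable LangChain tool, against the 2023-era LangChain API; the tool name and description are my own:

```python
from langchain.agents import Tool, initialize_agent
from langchain.llms import OpenAI

# Wrap the LlamaIndex query engine as a LangChain tool.
tools = [
    Tool(
        name="LlamaIndex",
        func=lambda q: str(index.as_query_engine().query(q)),
        description="Useful for answering questions about the loaded documents",
    )
]

agent = initialize_agent(
    tools, OpenAI(temperature=0), agent="zero-shot-react-description", verbose=True
)
agent.run("Summarize the document in a few sentences")
```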
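And a sketch of the memory example, using GPTIndexChatMemory to back a conversational agent's memory with a LlamaIndex index; the parameter choices follow that era's docs, but treat them as assumptions:

```python
from langchain.agents import initialize_agent
from langchain.llms import OpenAI
from llama_index import GPTListIndex
from llama_index.langchain_helpers.memory_wrapper import GPTIndexChatMemory

# Back the agent's chat memory with an (initially empty) LlamaIndex index.
memory = GPTIndexChatMemory(
    index=GPTListIndex([]),
    memory_key="chat_history",
    query_kwargs={"response_mode": "compact"},
    return_source=True,    # return source nodes instead of querying the index
    return_messages=True,  # return the chat history as messages
)

agent = initialize_agent(
    [], OpenAI(temperature=0), agent="conversational-react-description", memory=memory
)
agent.run("hi, i am bob")
agent.run("what's my name?")  # the agent recalls "Bob" from the indexed history
```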
So yeah, you can totally combine different functionalities of LlamaIndex with LangChain if you like. That's it for this video. Let me know if you use LlamaIndex or LangChain and which tool you prefer, and whether you use them together or separately; I would love to hear from you. Hope you found this video helpful. See you next time, bye!
Info
Channel: Sophia Yang
Views: 10,426
Id: cNMYeW2mpBs
Length: 12min 36sec (756 seconds)
Published: Mon Jun 19 2023