Chat with Your Documents using Llama-Index and Google PaLM 2 (Free LLMs and Embeddings)

Video Statistics and Information

Captions
In this video we will learn how to create a chatbot to chat with our own documents using LlamaIndex, which is an alternative to LangChain. LlamaIndex gives us the ability to build powerful applications based on large language models. These applications include document Q&A, data-augmented chatbots, knowledge agents, and structured analytics. We can also connect to different data sources, including unstructured, structured, and semi-structured data. Using LlamaIndex we can quickly build LLM-based applications, just like with LangChain. So let's get started.

In this tutorial we will be creating a chatbot to chat with multiple documents using LlamaIndex, and we will be using the Google PaLM 2 large language model. PaLM 2 is free to use, unlike closed-source models such as OpenAI's that have a cost associated with them, and we will be accessing the model through an API key.

Let me show you the architecture of how we can create a chatbot to chat with multiple documents. Here is the complete architecture. In step number one, the user uploads the documents, which can include PDF files, .txt files, and .docx files. In step number two, we extract all the data we have in those PDF, text, or docs files. As you know, Google PaLM 2, like any other large language model, open source or closed source, has an input token limit. PaLM 2 has an input token limit of 8,096 tokens, and one token is roughly equal to four English characters, so that comes to about 32,000 English characters. PaLM 2 cannot take more than about 32,000 English characters at the input. When we extract data from multiple documents, the extracted data can easily be more than 32,000 English characters, so we cannot pass it directly to the PaLM 2 model; we need to split the data into small chunks.

So in step number three, we split all the extracted data into a number of small chunks. We can define the chunk size, so that each chunk has a maximum of, say, 500 or 800 English characters. After dividing the data into small text chunks, we create embeddings for each of the text chunks. Embeddings compress a text chunk into a fixed-size representation: an embedding is a vector of floating-point numbers, like 0.1, 0.3, and so on, as you can see over here. We create one embedding per text chunk: if we have 100 text chunks we will create 100 embeddings, and if we have 10 text chunks we will create 10 embeddings.

Then, in step number four, we save all those embeddings in a knowledge base, building a semantic index; in other words, we store the embeddings in a vector database. Now, after we have saved all the embeddings in the knowledge base, when the user asks a question we create an embedding for that question as well, and then we search our vector database for the chunks most relevant to the question. We have the option of choosing how many results to return, so we rank, say, the top four or five matching chunks for that question.
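To make this retrieval step concrete, here is a minimal sketch of chunk embedding and ranking using the sentence-transformers package; the chunk texts and the model name are illustrative assumptions, and LlamaIndex performs all of this for us internally.

from sentence_transformers import SentenceTransformer, util

# Illustrative text chunks standing in for the chunks produced in step three.
chunks = [
    "Rachel Green has a PhD in English.",
    "YOLOv7 is trained on the MS COCO dataset.",
]

# Assumption: any small Hugging Face embedding model works here.
model = SentenceTransformer("all-MiniLM-L6-v2")
chunk_embeddings = model.encode(chunks)  # one embedding vector per chunk

question = "Which dataset is YOLOv7 trained on?"
question_embedding = model.encode(question)

# Rank the chunks by cosine similarity to the question and keep the top
# matches; these would then be passed to the LLM together with the question.
scores = util.cos_sim(question_embedding, chunk_embeddings)[0]
for i in scores.argsort(descending=True)[:2].tolist():
    print(round(scores[i].item(), 3), chunks[i])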
After ranking those top chunks, we pass the question to our Google PaLM 2 model together with the ranked chunks, and PaLM 2 gives us a natural-language response at the end. So this is how it looks; this is the complete architecture overview, and now let's move toward the implementation part.

But before I move to the implementation, let me show you how you can get your Google PaLM API key. You just search for "Google PaLM API" and click on the first link. The Google PaLM API key is available under MakerSuite, so you just need to go to MakerSuite from here and request access to the API key. I have already been granted access; it can take several hours before you get access. Then I will just create a new API key; you can create multiple API keys. I have already created one, so I will just copy it and use it in this tutorial.

Now let's move to the Google Colab part. Over here you can see the notebook in which we will be doing chat with multiple documents using LlamaIndex, so I am just connecting. Before running it, please make sure that you have selected CPU or GPU as the runtime. I will be using CPU; you can also use a GPU, it's your choice. Because we will be accessing the large language model through the API key, we don't need a GPU and can easily use a CPU.

In step number one, I install all the required packages. These include llama-index, which is the framework that allows us to build applications powered by large language models; pypdf, to extract the data from PDF files; google-generativeai, to access the PaLM 2 model through the API key; and sentence-transformers, to download the embedding model from Hugging Face. To store the embeddings in a dedicated vector database we could install an additional package as well, but we are not doing that here. All the packages are being installed, so this will take a few more seconds.

After the packages are installed, I import all the required libraries. SimpleDirectoryReader is used to load all the data we have in our directory, which can include PDF files, .txt files, or .docx files. Then we have VectorStoreIndex, so that we can store our embeddings. We are using Google PaLM as our large language model, so we import PaLM. To display the output response we require the display library. Then we have ServiceContext and StorageContext; these two are used to save the embeddings. We also have load_index_from_storage, so that we can load the embeddings we have saved, and we import os so that we can pass our Google PaLM API key as well.

Now that all the packages are installed and the libraries imported, I will create a directory named data in the Files panel using mkdir. You can see that the directory has been created, but it is still an empty directory.
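The installation and import steps just described look roughly like this; this is a sketch assuming the pre-0.10 llama-index package layout that was current when the video was made (newer releases moved these imports into subpackages).

# Step 1: install the required packages (run in a Colab cell).
# !pip install llama-index pypdf sentence-transformers google-generativeai

from llama_index import (
    SimpleDirectoryReader,    # loads PDF, .txt and .docx files from a directory
    VectorStoreIndex,         # chunks documents and stores their embeddings
    ServiceContext,           # configures the LLM, chunk size, and embeddings
    StorageContext,           # used when persisting the index to disk
    load_index_from_storage,  # reloads a previously saved index
)
from llama_index.llms import PaLM               # Google PaLM 2 wrapper
from IPython.display import Markdown, display   # pretty-print responses
import os                                       # to pass the API key

# Create the (still empty) data directory for the uploaded files.
# !mkdir data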
Next, I will upload some files into this directory. I will upload three files: a sample resume PDF; the state_of_the_union.txt file, which contains the complete speech of the US president; and the YOLOv7 paper, the original paper of YOLOv7, which is an object detection model.

Now, using SimpleDirectoryReader, I extract all the data we have in the data directory, from both the PDF files and the .txt file, and save it into my documents variable. Let me show you this data. You can see that we have extracted all the data from those two PDF files and the .txt file; it's quite a huge amount of data.

Next, I set an environment variable and pass my Google PaLM API key, because we are using the PaLM 2 model as our large language model right here. Then I configure the service context. Previously I was getting an error here; selecting the GPU runtime sorted it out. Google Colab offers a free GPU with limited usage, so we can make use of that.

As you know, we have extracted the data from our data directory into the documents variable. Now we split that data into small chunks: each chunk will have a maximum of 800 characters, and there will be an overlap of 20 characters between chunks. I am also passing the large language model here, which is the Google PaLM model. Let's run this. The Hugging Face embeddings will be used by default; you can see the name of the embedding model being used, and the model is being downloaded over here. This will take a few seconds.

We have now split our data into small chunks, and next we create embeddings for each of the text chunks, which is what is happening here. If you want to store the embeddings as well, you can persist them: over here you can see the storage folder, with all the embeddings saved in JSON format. If you open one of the files, you can see the embedding dictionary with all the floating-point numbers, and you can inspect all those details over here. You can also download the embeddings, and if you later want to load the saved embeddings instead of creating them again, you can just uncomment the loading code and run it.
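Putting these steps together, the notebook code looks roughly like this; it is a sketch under the same pre-0.10 llama-index API assumptions, the key string is a placeholder, and note that llama-index measures chunk size in tokens even though the video describes it in characters.

# Pass the Google PaLM API key through an environment variable (placeholder).
os.environ["GOOGLE_API_KEY"] = "YOUR_PALM_API_KEY"

# Extract the data from every file in the data directory.
documents = SimpleDirectoryReader("data").load_data()

# Configure the service context: PaLM 2 as the LLM, chunks of at most 800
# units with an overlap of 20, and a local Hugging Face embedding model.
service_context = ServiceContext.from_defaults(
    llm=PaLM(),
    embed_model="local",   # downloads a default Hugging Face embedding model
    chunk_size=800,
    chunk_overlap=20,
)

# Create one embedding per chunk and build the index.
index = VectorStoreIndex.from_documents(documents, service_context=service_context)

# Persist the embeddings as JSON files under ./storage.
index.storage_context.persist()

# To reuse the saved embeddings later instead of re-creating them, uncomment:
# storage_context = StorageContext.from_defaults(persist_dir="./storage")
# index = load_index_from_storage(storage_context, service_context=service_context)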
Now let's do question answering with our documents. My first question is: what is Rachel Green's qualification? This comes from the resume PDF. Over here I get the response: Rachel Green has a PhD in English from the University of Illinois at Urbana-Champaign. The next question I ask is: YOLOv7 is trained on which dataset? The answer is the MS COCO dataset, which is correct. Another question I ask over here is: which models does YOLOv7 outperform? The response says that YOLOv7 outperforms YOLOR, YOLOX, YOLOv5, DETR, and many other object detectors in speed and accuracy, so you can see the response. Let's ask a question from the .txt file, or rather, let's ask another question about YOLO: who proposed YOLOv7? The response says it was proposed by the authors of the YOLOv7 paper.

So that's all for this tutorial. We have seen how we can chat with different documents, including PDF and .txt files; you can upload .docx files and presentation slides as well. We have seen how we can chat with multiple documents using LlamaIndex and the Google PaLM 2 model. That's all from this tutorial, thank you for watching.
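As a closing reference, the question-answering step demonstrated above looks roughly like this in code, under the same API assumptions as the earlier sketches.

# Build a query engine on top of the index and ask questions.
query_engine = index.as_query_engine()

response = query_engine.query("What is Rachel Green's qualification?")
print(response)

response = query_engine.query("YOLOv7 is trained on which dataset?")
print(response)

response = query_engine.query("Which models does YOLOv7 outperform?")
print(response)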
Info
Channel: Muhammad Moin
Views: 3,884
Keywords: LlamaIndex, Llama-Index, chatgpt, chatbot, chat with documents, chat with multiple documents, Google PaLM 2, PaLM, palm api, RAG
Id: aWGn61YT2dA
Length: 14min 59sec (899 seconds)
Published: Mon Sep 25 2023