ChatPDF: Chat with PDFs using this Python program!

Video Statistics and Information

Captions
I was relaxing on the porch of my dream house when, all of a sudden, a thought struck me: I need to talk to my documents. But how? How does someone talk to documents? Of course, using AI. I ran to my computer to write a program that would help me talk to my documents. After long, long seconds of LangChain and LlamaIndex, I can talk to my documents. Yes! I have an OpenAI account, so I will use that account's API key to talk to my documents.

Talking to documents is not like talking to a wall, and it is not like talking to a house; it is not like talking to a tomato plant either. Talking to documents is not free: you need the free initial account at least, and after that you need to enter your credit card info.

I imported the packages I needed and set the necessary environment variables, one of which is my credit-card-sucking OpenAI API key. I need to pass in the documents I have in the data folder and create chunks of text out of the data, because calls to the GPT model do not like long documents. Then I need to create embeddings for the chunks. I need indexing to keep track of the chunks and the generated embeddings; the embeddings will help me find the chunks most contextually relevant to a prompt. I'm using the text-davinci-003 model, which belongs to the GPT-3 model family.

I love talking to my documents, but I want to avoid creating these embeddings again and again when I need to talk to the same documents. Why do I not want to create embeddings every time I load the program for the same set of documents? Duh, I do not have plenty of money, and creating embeddings for chunks of documents using OpenAI requires money. Therefore I want to store these embeddings and the chunks of my text in a folder, say a folder named chunk storage, so that whenever I need to talk to my documents I can load the chunks and the embeddings and speak to my documents. I ran the program and found that the chunk storage folder was created. I am a happy man who lost some money in creating the chunk storage. Losing some money to OpenAI cannot break me yet, because I am motivated and willing to spend some money to, well, talk to my documents.
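The captions do not show the code itself, but the video's keywords mention LlamaIndex, so the first program could look roughly like the minimal sketch below. The llama_index imports (which vary by library version), the folder name chunk_storage, and the placeholder API key are assumptions rather than the author's exact code.

import os
from llama_index import SimpleDirectoryReader, VectorStoreIndex

# One of the environment variables mentioned above: the OpenAI API key.
os.environ["OPENAI_API_KEY"] = "sk-..."  # placeholder, use your own key

# Load every document (PDF, text, etc.) found in the data folder.
documents = SimpleDirectoryReader("data").load_data()

# Chunk the documents, create embeddings for the chunks, and index them.
index = VectorStoreIndex.from_documents(documents)

# Persist the chunks, embeddings, and index so they can be reloaded later
# without paying OpenAI to embed the same documents again.
index.storage_context.persist(persist_dir="chunk_storage")

SimpleDirectoryReader accepts PDF and plain-text files alike, which matches the mixed data folder described in the captions.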
So far, if you liked the video, then please press the like button. If you do not like this video so far, well, what can I say? Hit the Subscribe button to subscribe anyway. Back to the story.

Still, the question of how I would talk to my documents remained a mystery. I am writing another program that will load the chunks of text and embeddings from the chunk storage. Then the program will help me communicate with OpenAI based on my questions: it will take my question and find the chunks of text relevant to my inquiry. It will then send OpenAI the relevant chunks and my question, so that OpenAI can answer my question based on the relevant chunks. OpenAI will charge the account based on the number of tokens in the chunks and questions sent and on the size of the answer, so I only ask questions if I need an answer; I do not ask questions just to socialize with my documents. I made sure to write the questions and the responses to a text file whose file name contains the timestamp of when I started the conversation with my documents.

That is it. Running through the two programs again: the first program is to model the texts of the documents using OpenAI's embedding capability; the second program is to enjoy talking to your documents. You run the first program just once to create your chunk storage from your documents. Once your chunk storage is ready, you run the second program whenever you need to talk to the same documents in your data folder. The second program will actually send OpenAI the parts of text relevant to the question or prompt, and OpenAI will provide its educated answer to the question based on the chunks sent to it. For new documents, you need to run the first program again to model your documents in the data folder. What a clever idea! With this idea you can automate question answering to help your clients based on an existing knowledge base.

An example: I drop a document in the data folder. It is a research paper. I could drop many papers in this folder, but I love keeping my money in my pocket and giving less to OpenAI, so I just dropped one paper in this folder. Documents can be PDF files, text files, etc. After I run the first program, the chunk storage folder is created, where I have all the chunks from the paper, the corresponding embeddings, and the indexing. Now I run the second program to talk to my document. I ask what the paper is about, and it gives a correct answer. How do I know this answer is correct? Because I am one of the authors of this paper. I ask what datasets are used in this paper; the answer is correct. I asked a few more questions, and the responses are amazing. Talking to documents is no longer a dream.

I will provide a link containing the code in the description section below. If you'd like to see more videos like this, write a comment stating "more please". If you do not want to see any more videos like this, write a comment stating "enough is enough". Either way, I will see you soon. For now, I will go back to my dream house and relax.
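Likewise only a hedged sketch and not the author's published code: the second program described in the captions above might look like this, assuming the same llama_index version. It reloads the persisted chunks and embeddings, answers questions against them, and logs each question and answer to a text file named with the timestamp at which the conversation started.

import os
from datetime import datetime
from llama_index import StorageContext, load_index_from_storage

os.environ["OPENAI_API_KEY"] = "sk-..."  # placeholder, use your own key

# Reload the chunks, embeddings, and index persisted by the first program.
storage_context = StorageContext.from_defaults(persist_dir="chunk_storage")
index = load_index_from_storage(storage_context)

# The query engine finds the chunks most relevant to each question and sends
# them, together with the question, to the OpenAI completion model.
query_engine = index.as_query_engine()

# Log the conversation to a file named after the session start time.
log_name = "chat_" + datetime.now().strftime("%Y-%m-%d_%H-%M-%S") + ".txt"
with open(log_name, "a", encoding="utf-8") as log:
    while True:
        question = input("Ask your documents (blank line to quit): ").strip()
        if not question:
            break
        answer = query_engine.query(question)
        print(answer)
        log.write(f"Q: {question}\nA: {answer}\n\n")

Only the chunks judged relevant to the question are sent with each request, which keeps the token count, and therefore the OpenAI bill, small.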
Info
Channel: Computing For All
Views: 1,588
Keywords: LlamaIndex, Embedding Documents, openai, chatpdf, chatgpt tutorial, langchain, langchain chatbot, langchain pdf chatbot, langchain chatbot memory, langchain agents, prompt engineering, chatgpt api python, langchain ai, llm, langchain llama, chatgpt, chat multiple pdf, multiple pdf chatgpt, chatgpt plugins, chat with your data, chat with your documents, private gpt, artificial intelligence, ai, chat with files, open-source gpt, gpt-4, gpt4all, gpt4all langchain, python
Id: aBdzi3N4Wno
Length: 5min 19sec (319 seconds)
Published: Mon Jun 05 2023