8 Minutes LangChain OpenAI Beginner Tutorial | ChatGPT with your PDF

Video Statistics and Information

Captions
If you haven't been living under a rock, you've probably heard of Sam Altman, CEO of OpenAI, which has just released ChatGPT. Since then a lot has been happening in the space, so here's a quick refresher. ChatGPT is powered by an LLM, which stands for large language model. These models are trained on billions and billions of pieces of text, which led to the creation of GPT-4 by the OpenAI team. There are also other large language models which have just been released or are still in beta. LLMs are complex pattern-recognition machines: doing things like writing essays an hour before the deadline is now a walk in the park, since they have seen so many examples in their training data. However, one issue in particular with the GPT models trained by OpenAI is that they've been trained on data up to September 2021. Ask about anything that happened after that, and they might generate an answer that sounds satisfying but is false.

In this beginner tutorial we will be using LangChain, an abstraction over LLMs that will enable us to easily chat with a PDF created after the September 2021 cutoff date. But before jumping into the code, let's talk about the architecture. First we will parse the PDF so that we can get the text of each of its pages. Then we will chop each page's content into multiple chunks; we do this because we're trying to keep the context relevant within a chunk. Every time we ask our chatbot a question, it will ultimately retrieve four chunks to help it construct its answer.

But how does it know which chunks to use? That's where embeddings come into play. Embeddings are high-dimensional vectors that aim to represent the meaning of a text chunk. Let's say we want to classify text chunks as positive or negative using only numbers: 0 means negative and 1 means positive. Then an embedding of "happy" would be 1, an embedding of "sad" would be 0; "a warm welcome" would turn into 1, "an unfortunate event" into 0. This is fine but very limited. That's why, instead of
assigning just one number, we can assign multiple numbers, enabling us to represent more nuances of the text than simply positive or negative. OpenAI offers an endpoint to do this for us. Finally, generating these embeddings is cheap but still costs money, so we'll store them in ChromaDB, an open-source vector database. LangChain will mostly be used in steps 3 and 4: the abstraction it provides means we could swap OpenAI's embedding endpoint for another one like Cohere, or change the vector database to Pinecone, without requiring many code changes.

Now let's get into the code. To get started we'll need an API key from OpenAI; the link will be in the description below. If you don't have an account you can go ahead and create one, otherwise click on "Log in". We'll click on the top right corner of our screen, select "View API keys", and here we can create them. The name doesn't really matter; we can always change it or delete the key itself. We'll copy it over before clicking "Done", then paste it into our .env file.

So now we can start implementing the ingestion part. The code will be in the description below so you can follow along, and if anything is unclear, rewind or take a look at the code repository. First we'll load the environment so that LangChain can use the OpenAI key. We'll create a function that returns the raw text and metadata from the PDF. The parse_pdf function will take in the file path and return a tuple: the first element will give us the page numbers and raw text, the second element the metadata of the PDF in the form of a dictionary. To extract the metadata we'll use the pypdf library: we read the file in binary, create our reader, and take the title, author, and creation date. To capture the text of the PDF we'll use pdfplumber to help us out. We'll collect tuples containing the text and its page number: we go through each page, check if there's text, and append it to the list. Once that's done we return what we've found, coming back to our main function.
We now have parsed the PDF: we have the pages and the metadata. We'll clean the text in order to help with readability and produce better results, and then chop the pages into text chunks. As this is a demo, to reduce embedding costs we'll only keep the first 23 pages; this is optional and totally up to you. To clean the text we'll go through each page and apply our cleaning functions to it, essentially returning a list of tuples containing the cleaned text paired with its page number. We'll use regular expressions to clean the text: merge hyphenated words, fix newlines, and remove multiple newlines.

Now, to create the text chunks, LangChain provides the RecursiveCharacterTextSplitter class. We'll take chunks of a thousand characters with overlaps of 200 characters, and we'll create a Document object for each chunk. It will contain the text and metadata such as the page number and the chunk number; the source tag will help us track the chatbot's sources. We'll also pass along the metadata we collected from the PDF, and finally return the documents.

The more involved part is now done. For steps 3 and 4 we'll choose OpenAI to make our embeddings. LangChain provides a helper method on the Chroma class; using it will embed all of our text chunks and store them in the vector database. We simply have to specify the persist directory on our hard drive, and that's it. Here's a look at what we've done so far: we've parsed the PDF into multiple pages, made a few text chunks for every page, generated an embedding for every chunk, and stored it with its text chunk in the vector database.

Now, to actually have a conversation with our PDF, this is how it's going to work at a high level. First the user is prompted to enter a question. Then a condense-question prompt is going to take in your chat history and your last question, and LangChain will call OpenAI to rephrase your question given the context of the discussion. Once it gets the new question, it will embed it so that it can do a similarity search over the vector database.
It will take the four text chunks whose embeddings are closest to the embedding of our new question; in other words, the ones with a high probability of containing information that will help answer the question. These four text chunks are grouped with our new question into a prompt that is sent to OpenAI's API. The large language model will then be able to use our text chunks to answer the question, so we finally get the answer back.

I know this is a lot, and thankfully LangChain provides a toolkit that makes this a total breeze to write, so let's get into it. Once again we'll load the environment, which has our OpenAI key. We call our function make_chain, which will return a LangChain Chain object. This is something I might cover in depth in a following video, so don't forget to subscribe to be notified; in this case, put simply, it represents the abstraction we use to interact with OpenAI's chat endpoint. We'll also want to keep track of the chat history. We'll put the rest of our code in an infinite loop: we prompt the user for a question, then run the chain on the question and chat history. We get back an object containing the answer and the four text chunks it used; thanks to the metadata we specified, we know exactly which pages were used by our chatbot. We can now append our question and its answer to our chat history for the follow-up question, and finally we print the answer and its sources.

Now let's write our make_chain function. We'll use OpenAI's large language model, and we can specify the model name to be gpt-3.5-turbo. Again we'll choose OpenAI's embeddings, since everything in our ChromaDB has been embedded using OpenAI. We'll specify the same collection name we used in the ingestion part, pass the embedding function, and tell Chroma the directory we used to store the data. We can then easily instantiate a chain by using a helper method: we'll pass in the large language model we're using and the DB retriever as arguments. We'll also ask it to return the sources it used, and that's it.
Here's a quick demo of what we've got. The link to the PDF is available in the description below and in the code repository; it's the International Monetary Fund's World Economic Outlook from April 2023. So now let's ask it: what economic impacts were unforeseen? You can see it used chunks from pages 10, 16, 18, and 21. The answer mentions how inflation expectations are now higher than previously forecasted, which seems in line with what we've been hearing in the news. This PDF was clearly created after the cutoff date, so there was no other way it could have known. Let's give it a follow-up question and ask what events might have affected the inflation we're currently seeing. Here it gives as an answer a list, and we can see it mentioning the war in Ukraine, which started in February 2022. And this wraps up our tutorial! If this video helped you, please like and subscribe. Thanks for watching, I'll see you later.
Info
Channel: Edrick
Views: 16,879
Keywords: OpenAI, ChatGPT, LangChain, ChromaDB, Tutorial, Beginner, EasyTutorial, AI, ArtificialIntelligence, MachineLearning, LargeLanguageModel, LLM, Course, PDF, Concepts, Guide, Education, Programming, Software, Engineering
Id: FuqdVNB_8c0
Length: 7min 47sec (467 seconds)
Published: Thu May 18 2023