ChatGPT With Your Docs | Full Tutorial WITH Code Examples

Video Statistics and Information

Captions
Have you ever wanted a ChatGPT, but specifically for your own documents? Well, I'm here with the founder of LangChain, Harrison, and we're going to do a quick tutorial on how to build question answering over documents, similar to ChatGPT. My name is Rachel, I'm the founder of the AI Exchange, and we are a community of people who are actively applying AI in their work and products. And this is Harrison. Do you want to give a quick intro on yourself?

Absolutely. Hello, I'm Harrison, founder of LangChain, a Python package and set of tools developed for making it easy to build language model applications.

Awesome. So in this video we're going to walk through this Colab notebook that Harrison has prepared. We will also add a link to this notebook in the description. Our goal with this video is to help you feel confident understanding how you could build Q&A over documents using LangChain, but also, more generally, how these systems work and how you can build this type of product for yourself or in your own company. Harrison, do you want to go ahead and kick us off?

Excited to get started. Yeah, so we're going to start by importing the Python packages that we need. If you remember from the starter video, the first two that we're going to use are openai and langchain: OpenAI is the language model provider, and LangChain provides a lot of the glue and the functionality for constructing complex chains. And then we're also going to add a third, which is faiss-cpu.

We're going to work with the State of the Union text. This is just a long document, and the purpose of it being a long document is that it's actually too long to pass into GPT by itself. I think that's an interesting challenge that you'll quickly get to if you try to do question answering over your own documents, because you generally want to do it over a large collection of documents. In this case it's a single document, but it's very easy to extend to multiple documents. We're also going to set up our environment variables with the OpenAI key, and then we're ready to go.

The first thing we're going to do is take this long text and split it up into smaller chunks. The reason this is important is that we want to be able to pass the most relevant pieces of text to the language model when we're asking questions about it. As mentioned before, we can't just pass in the whole text itself, because that will run into context window errors. Here we're going to split it up into chunks of length a thousand. You can pick and choose your own size; there's not a lot of science that I've read about how big those chunks should be. I think that's an underexplored area of research.

So we're going to take this text and split it up into a bunch of smaller texts, and then we're going to create embeddings for those texts and put them in a vector store.

So what are embeddings, what is a vector store, and why is this important?

All right, so you have a piece of text. An embedding is basically a numerical representation of that text. The reason we want a numerical representation is that it makes it really easy to do math with that text, so you can find other pieces of text that are similar to it. That's the main use case we'll be doing in this video. We're going to create these embeddings for each of the chunks, and then we're going to put them in a vector store. A vector store is basically a place to store vectors of numbers, in this case the embeddings, and it's optimized for doing things like looking up similar vectors. So it's just a place to put these embeddings where we can look up which embeddings are close to other embeddings very easily.

So we're going to run this, and it will take a little bit, because we're calling OpenAI's embeddings under the hood and storing them in FAISS, which is an open-source vector store. There are other alternatives as well, including hosted solutions that you can look into, but for a quick start we're going to use FAISS.
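The setup described so far might look roughly like this; a minimal sketch assuming the early-2023 LangChain API used in the notebook (class locations have moved around in later releases, and the file name is just a stand-in for your own document):

```python
import os

from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import FAISS

os.environ["OPENAI_API_KEY"] = "..."  # your OpenAI API key

# Read the long document and split it into chunks of roughly 1,000 characters
with open("state_of_the_union.txt") as f:
    state_of_the_union = f.read()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
texts = text_splitter.split_text(state_of_the_union)

# Embed each chunk with OpenAI's embeddings and index them in FAISS
embeddings = OpenAIEmbeddings()
docsearch = FAISS.from_texts(texts, embeddings)
```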
Awesome. One question on the chunking, I guess, while this is running. Right now you're feeding in just a straight set of text, is that correct?

Yeah.

What are some other types of text that you've seen fed into a system like this, and how does that impact chunking?

Yeah, I think chunking, again, is an underexplored area. What most people generally do is first load the documents as text, then clean them up a little bit and remove some of the crust, and then, once they've got this unstructured text, they start splitting it into chunks. The loading is a bit specialized to the types of documents you're interested in doing search over. If you're interested in doing search over Notion, for example, I have a separate repo, which we can link to in the description, that loads it from the Notion database. If you're interested in doing it over Google Drive, you need to write or use some logic to load from Google Drive. So the loading of the text is one thing that's a bit specialized.

Another thing that's specialized is the idea of splitting text. The main consideration here is that you want semantically meaningful pieces of text. By default, this recursive character text splitter starts by trying to split things into paragraphs, so it looks for double newlines. You can imagine that sometimes this isn't good, because there could be a really long paragraph, and so then what do you do? So the recursive character text splitter starts with paragraphs, then falls back to smaller units, and then, if it needs to, the characters themselves; there's a sketch of this below. That's a general approach to it. I think there's a lot of interesting work to be done in figuring out what the best way to split things in a semantically meaningful way actually is. You can imagine that with a code base you probably want to split it so that functions stay together in the same chunk as much as possible; you probably don't want to split in the middle of a function, that's one example. For books, maybe you want to split by chapters first and then paragraphs, or something like that. I think there's probably some general stuff you can do, and the recursive text splitter is a good stab at that, but for a lot of specialized applications you may want to look at the data a little more carefully and come up with a better way of splitting text, because I do think it can be important to maintain the semantic coherence of chunks.

Yeah, two other pieces of advice I would add here. One, again going to the semantic meaning: really think about what the unit of semantic meaning is in the data you're feeding in. For example, if you're trying to use this for tweets, it's really nice, because each tweet ideally carries its own semantic meaning. And the other piece of advice is, if you try this approach with one type of chunking and it doesn't work very well, experiment with other types of chunking; that can be a way to get performance gains in the overall application. While there's not a best practice for how to chunk today, it might really vary based on the application and what you're trying to do, so just experiment with more things. How you chunk is definitely something to think about relative to how you're trying to use this.
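As a sketch of that fallback behavior, the recursive splitter can be configured with an ordered list of separators to try; the separators shown here match its usual defaults (paragraphs, then lines, then words, then individual characters as a last resort), again assuming the early-2023 API:

```python
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Falls back through the separators in order: paragraphs (double newline),
# then single lines, then words, then individual characters
splitter = RecursiveCharacterTextSplitter(
    separators=["\n\n", "\n", " ", ""],
    chunk_size=1000,
    chunk_overlap=0,
)
chunks = splitter.split_text(state_of_the_union)
```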
Okay, so it looks like we're using embeddings from OpenAI. Is that the only way to generate embeddings, or are there other ways?

There are a bunch of other ways to generate embeddings. Similar to language models, most providers also have embedding support. There are Cohere embeddings that you can access over an API, similar to OpenAI, and Hugging Face also has a lot of embedding functionality that you can either use off the Hugging Face Hub or run locally.

All right, so we've got this vector store set up. Now we need to set up the logic for doing question answering over it. This is one of the more common chains, so we're going to use a pre-built chain in LangChain called the VectorDBQA chain; it's basically question answering over a vector DB. You can see that we pass in a language model here, and we're using one from OpenAI. We then specify the chain type; there are a few different ways to do question answering over documents. "Stuff" is a great name, and it basically refers to taking the relevant pieces of text and just stuffing them into the same context, which then gets passed to the language model. This is the fastest, because you're making one call to the language model, and it's generally the best performing if you can do it, because it has all the context in one place. The downside, obviously, is that if you have a lot of documents to combine, you can't do that, because you'll start going over the context window length. There are other methods in LangChain, like map-reduce, that can get around this, but we'll stick with "stuff" for now. And then we pass in the vector store as well. With these three components we can initialize this question answering chain.

For people curious about the other chain types, where could they go? We can provide a link to the documentation in the description of this video. There's a good notebook that walks through the four different chain types we have available, and there are pros and cons to each one, so I do think it's good to get some intuition about them. The other thing I'll note, which is in that same notebook, is that you can customize the question answering with different prompts. If you notice, here we don't specify a prompt, and that's because we're using some pre-existing hard-coded prompts. But you could very easily imagine wanting to customize the prompts yourself and add some information about how the language model should respond and what kinds of questions it should try to answer. That notebook also goes over how to customize the question answering to your specific use case by passing in different prompts, and different prompt templates in particular, which we covered in the last video.
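Put together, the chain construction described above looks roughly like this; a sketch using the VectorDBQA class from early-2023 LangChain (later versions renamed this interface, so treat the names as a snapshot of that API):

```python
from langchain.llms import OpenAI
from langchain.chains import VectorDBQA

# Three components: a language model, a chain type, and the vector store
qa = VectorDBQA.from_chain_type(
    llm=OpenAI(),
    chain_type="stuff",  # stuff all retrieved chunks into one prompt
    vectorstore=docsearch,
)
```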
All right, so we've got this set up, and now we can start using it. With this nice little run function we can ask, "What did the president say about Ketanji Brown Jackson?" It runs for a little bit, and then it spits out an answer. What this chain is doing is taking the query, which is a string, creating an embedding for it, and then looking up the relevant pieces of text in the vector store. It then takes those texts and puts them into a prompt that basically says: answer the following question (insert the query here) given these pieces of context (insert the relevant documents there), and passes that to the language model. The language model then responds with this answer here.

Awesome. So I guess, why is this important compared to just using language models straight out of the box? What would this response have looked like if we weren't using this kind of embedding-based lookup approach?

Let's find out, let's write some code live. Let's see if my keyboard shortcuts work. Okay, there we go. So let's set up the language model here, and then let's just run the query on the language model directly and see what it says: "During her nomination announcement, President Biden said that Ketanji Brown Jackson has the potential to be one of the..." So you can see here the language model is answering this question based on all the possible information it was trained on. There are two important things there: one, it's all pieces of information, and two, it's what it was trained on. OpenAI's language model in this case was trained up to some date in 2021, so basically anything that's happened since then it has no knowledge about, and if we want to be able to answer questions about that, we need to provide the context somehow. We can do that by providing it in the prompt.

The other thing is that even if the question is related to an event that happened before 2021, it could be related to information the language model knows nothing about. You have personal notes, Rachel; the language model hopefully does not know what's in those notes, but you could imagine wanting to ask questions about them, even if they're from five or ten years ago. This is a way to get answers that are grounded in that information, as opposed to the full embodiment of all the information that's in the language model.

So it sounds like the value here is really a few things. One, you can control the context that gets put into the prompt, so you have more control over the output, which can help with things like whether it's factual or not. And second, you can give it access to new context that maybe OpenAI didn't have originally when the model was being trained, such as, yeah, my personal notes. I could build this type of system querying over my personal notes and then get specialized or customized responses based on those notes.

Yeah, that's exactly right, that's a great way of putting it. I think this is important because if your application doesn't have this type of personalization, this type of specialization on specific proprietary or unknown documents, there's no differentiation between an application you build and an application anyone else builds. It's really bringing this extra information and combining it with the language model that gives it a lot of differentiation and makes it really interesting and unique, for some of the reasons you just mentioned in terms of grounding and factuality.
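The live comparison Harrison ran might look roughly like this; a sketch, since the exact wording of each answer will vary from run to run:

```python
from langchain.llms import OpenAI

query = "What did the president say about Ketanji Brown Jackson?"

# With retrieval: the answer is grounded in the chunks pulled from FAISS
print(qa.run(query))

# Straight out of the box: the model answers only from its training data,
# which in this case has a cutoff sometime in 2021
llm = OpenAI()
print(llm(query))
```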
Awesome. Another common question I hear often is: how do I prevent ChatGPT, or GPT, from making things up in my product? Could you describe whether this is an approach you could use to prevent hallucination of information?

Yeah, it absolutely is, and I think the best way to do that with language models these days is you tell it not to make things up: you basically tell it, answer this question, but only use this specific piece of information. That's when it will, hopefully, ground its answer in that piece of information. Another thing you can do here, and maybe we can do another video on this later, is get sources for where it gets its information. You can imagine that after we create these chunks, we create an ID for each chunk; if these chunks of text come from different web pages or something, part of that ID could contain the URL or the page it's on. Then, when we answer questions, we can start including that source in our answers. And again, how do we do that? We ask the language model very politely to not make things up, and to please cite its sources when it answers.

Excellent, great. Do you want to continue through the rest of the notebook?

Yeah, the rest of the notebook is just highlighting an even easier way to load this chain. You can see that above, we had to import the chain from a specific class, use a method, pass in a language model object, and pass in a chain type. Another way to do that is to load a chain from the LangChain Hub, which is basically a way of sharing pre-configured chains. You'll notice that we still have to pass in the vector store, and again, that's because the vector store provides the interesting information, so it wouldn't make a ton of sense to serialize it along with the chain. But we have this chain that knows how to interact with the vector store, so we can run this, and it pulls this pre-existing chain down from the LangChain Hub, and now we have the exact same chain as above. We can run it just as before and get a similar answer.

I just want to highlight this because it's a new feature we've added, and we can include a link in the description below, but the idea here is just to make it really, really easy to load these types of chains. We're optimizing for making it extremely easy to build language model applications, and as you notice, this is one import and one line of code; that's what we're striving for. We're also intending this to be a place to share chains. I mentioned before customizing the prompts to include different variants of tone or other things like that; we're hoping to gather a collection of these variants of prompts, which in turn are variants of chains, host them in this Hub, and then people can download the different variants and play with them pretty easily.
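That hub loading step might look roughly like this; a sketch assuming the early-2023 LangChain Hub, where chains were addressed by lc:// paths. The path below is illustrative of the stuff-type vector-db-qa chain of that era, and the Hub has since been reworked:

```python
from langchain.chains import load_chain

# One import and one line of code: pull a pre-configured QA chain from the Hub.
# The vector store is still passed in separately, since it holds the actual data.
chain = load_chain("lc://chains/vector-db-qa/stuff/chain.json", vectorstore=docsearch)
chain.run("What did the president say about Ketanji Brown Jackson?")
```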
Awesome. I think this is a great overview of how to do question answering over documents. I'm curious, Harrison, to close this out: what are some other ways people can use the general framework described here when they're building these LLM applications?

Yeah, I think the general framework here is basically pulling in relevant context and inserting it into the prompt. One method is doing it exactly this way, where you're pulling in content that you want to get answers from, basically auxiliary information in that capacity. Another way you can pull in extra content based on the query is actually for some of the few-shot examples we talked about last time: depending on the query, you may want to give the language model different examples of how it should behave. That can be really useful, because you can provide examples that are similar to the query, so it can learn to do things the way it should based on the query. A third way, which is even more extreme, is that you can imagine changing the entire prompt based on the query: if the query is asking about one task compared to another, you may want to use a different prompt template entirely. This all gets back to the same idea: you've got a query coming in, and you now need to decide how to include information to pass to the language model. Those are three applications of that idea.

Awesome. Well, I think this is a fantastic overview of a common use case that I get asked about a lot, so I'm really excited about this notebook. Again, we will share links to a lot of the relevant documentation we described during this tutorial, as well as a link to this Colab notebook, for anybody who wants to dig in and get started using LangChain in their applications or test it out. Thanks so much for watching. Harrison, any closing thoughts?

Thanks for having me, Rachel, a pleasure as always.

Awesome. All right, and until next time.
Info
Channel: Rachel Woods
Views: 26,797
Id: kM3DPWO7YV4
Length: 16min 20sec (980 seconds)
Published: Tue Feb 14 2023