Imagine you are a business owner. You have tons of company documents on your computer, some meeting notes in Notion, some other notes in Evernote, and messages on multiple channels like Slack and WhatsApp, but you constantly waste time searching for information in multiple places. And a few months back, you heard about people using ChatGPT as their personal assistant. "So, how can I help you?" "Oh, it's just more that everything feels disorganized. That's all." "You mind if I look through your hard drive?" But you don't really see how this can apply to you, because your documents are private and may contain sensitive data, and you can't just paste all of your documents into ChatGPT, as you have hundreds of them in a lot of different places. What if I told you that there is a way to build a personal assistant that has been fed your entire company knowledge base? Hey friends, and welcome back to the channel! I'm
Dona, a software engineer based in Sydney, and in this series of videos I will show you, step by step, how to build exactly that. In this specific video, I will focus on the different steps and concepts you need to understand the big picture. Let's start with our first concept: the LLM. LLMs, or large language models, are nothing more than a really smart, fancy, long-form autocomplete. When you send a request to an LLM like GPT, for example, it will find the most probable answer based on the billions of data points it was trained on and return it to you. And instead of giving you a single word, it gives you entire generated sentences. In this example, I send the prompt "the mouse is eaten by" to the LLM, and it gives me back the most probable completion. Now generalize that to a whole sentence, and you understand how an LLM works.
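To make that concrete, here is a minimal sketch of sending that prompt to an LLM from code. It assumes the openai Python package (version 1 or later) is installed and that an OPENAI_API_KEY environment variable is set; the model name is just an example, not a recommendation from this series.

```python
# Minimal sketch: ask an LLM to continue a sentence.
# Assumes `pip install openai` (v1+) and OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # example model name
    messages=[{"role": "user", "content": "Complete this sentence: The mouse is eaten by"}],
)

# The model returns the most probable continuation it learned from its training data.
print(response.choices[0].message.content)
```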
think "Great then I can just create my own LLM with my own documents" Yes you could but that is
really expensive and it wouldn't be as performant as one that has been trained on billions of data
already and if you want to retrain an LLM with billions of data it's really really costly so
except if you Elon Musk, "Hi Elon" and you want to spend a few million dollars that's not really
a solution here and if you are Elon Musk then send me an email and I would love to interview
you on my channel so you could think "Okay then I will just paste all of my documents in the prompt,
on ChatGPT, for example." Well, the limitation here is that this doesn't scale, and an LLM like GPT-3 also has a limited context length, which means you can't just hand it your entire document collection and ask it to search through it. For example, GPT-3.5 prompts are limited to about 4,000 tokens, which is around 3,000 words, or roughly 6 pages in A4 format.
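If you want to check that limit yourself, here is a rough sketch using the tiktoken package (my assumption, it is not mentioned in the video) to count how many tokens a document takes; the file name is hypothetical.

```python
# Rough sketch: count how many tokens a document uses for GPT-3.5.
# Assumes `pip install tiktoken`; the file name below is hypothetical.
import tiktoken

encoding = tiktoken.encoding_for_model("gpt-3.5-turbo")

with open("meeting_notes.txt", encoding="utf-8") as f:
    text = f.read()

num_tokens = len(encoding.encode(text))
print(num_tokens, "tokens")  # GPT-3.5 only accepts ~4,000 tokens per prompt
```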
So you're going to ask me, "Okay, then what's the solution?" I want to see solutions, gentlemen, solutions... Well, the solution is to use multiple concepts, such as chunking, embeddings and a vector store. I will explain all of these concepts in this video, so let's get into it. Let's come back to our original problem.
I have multiple PDF documents, some text files and some WhatsApp conversations, for example. The first step is to split all of this text into small pieces of data called chunks. Here, I will divide all of the text into small, meaningful chunks of data, let's say around 200 characters maximum. I will also cut at new lines to avoid mixing contexts. Here you can see, for example, how I divide the Universal Declaration of Human Rights.
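As an illustration only, here is a tiny chunker along the lines just described: cut on new lines first, then cap each piece at roughly 200 characters. It is a simplified sketch, not the exact splitter used in this series, and the file name is hypothetical.

```python
# Simplified chunking: split on new lines, then cap each piece at ~200 characters.
def split_into_chunks(text: str, max_chars: int = 200) -> list[str]:
    chunks = []
    for line in text.split("\n"):          # cut on new lines to avoid mixing contexts
        line = line.strip()
        if not line:
            continue
        # if a line is still too long, slice it into max_chars pieces
        for start in range(0, len(line), max_chars):
            chunks.append(line[start:start + max_chars])
    return chunks

# Hypothetical file name, used only for this example.
with open("universal_declaration_of_human_rights.txt", encoding="utf-8") as f:
    chunks = split_into_chunks(f.read())
print(len(chunks), "chunks")
```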
Now that this is done, I will create embeddings for each chunk of data. And you're going to ask me, "Dona, what's an embedding?"
Embeddings are a way to represent information, whether text, images or even audio, in numerical form, also known as a vector. Imagine, for example, that you want to store the following words: Orange, Apple, King and Queen. These could be stored as the following vectors: Orange (2, 2), Apple (3, 3), King (7, 6) and Queen (8, 6). If we plot them on a 2D graph, you will see that Apple and Orange are close to each other, and that Queen and King are also next to each other. That's because they are more alike: Apple and Orange are fruits, while Queen and King are more related to people.
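Here is that toy example worked out in a few lines, using plain Euclidean distance to measure how close the 2D vectors are; the numbers are the ones from the example above.

```python
# The toy 2D "embeddings" from the example, and a quick distance check.
import math

vectors = {
    "Orange": (2, 2),
    "Apple":  (3, 3),
    "King":   (7, 6),
    "Queen":  (8, 6),
}

def distance(a, b):
    return math.dist(a, b)  # Euclidean distance between two points

print(distance(vectors["Apple"], vectors["Orange"]))  # ~1.41 -> very close
print(distance(vectors["King"], vectors["Queen"]))    # 1.0  -> very close
print(distance(vectors["Apple"], vectors["King"]))    # 5.0  -> far apart
```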
But now you might wonder how to generate these vectors, and that's our second concept, which we call the embedding model. To transform our chunks of text into embeddings, an embedding model is used. An embedding model is nothing more than an algorithm that transforms a piece of data into a vector, and different embedding models will lead to different results. Going back to our example, another algorithm could, for instance, consider two words similar if they are next to each other in alphabetical order. In that case, Apple would be closer to King than to Orange. You might say that doesn't make much sense, and you're right, but that's just a simplified version to show you how the whole thing works. In reality, these vectors contain thousands of dimensions instead of just two, which makes them way more powerful and accurate, and real embedding models are far more sophisticated than the toy algorithm I just showed you.
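As one possible way to do this step, here is a sketch that turns the chunks into embeddings with the sentence-transformers library and the all-MiniLM-L6-v2 model; both the library and the model name are assumptions for illustration, not necessarily what we will use later in the series.

```python
# Turning chunks into embeddings with an off-the-shelf embedding model.
# Assumes `pip install sentence-transformers`; the model name is an assumption.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # produces 384-dimensional vectors

embeddings = model.encode(chunks)  # `chunks` comes from the splitting sketch above
print(embeddings.shape)            # (number_of_chunks, 384)
```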
Once we have transformed our chunks of text into embeddings, we need to store them, and we're going to use a vector store for that. A vector store is a place where all of the embeddings are stored. As there are thousands of dimensions per embedding, each vector store has its own algorithms to store, index and search through them.
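To show the idea, here is a minimal sketch that stores those embeddings in FAISS, one popular vector store among many; choosing FAISS here is my assumption for illustration.

```python
# Storing the embeddings in a vector store. FAISS is just one possible choice.
# Assumes `pip install faiss-cpu` and the `embeddings` array from the previous step.
import numpy as np
import faiss

vectors = np.asarray(embeddings, dtype="float32")
index = faiss.IndexFlatL2(vectors.shape[1])  # exact search over L2 distance
index.add(vectors)                           # each row keeps the same id as its chunk
print(index.ntotal, "embeddings stored")
```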
As a summary, all of the documents that we want to add to our personal assistant will be divided into chunks of text, then we will use an embedding model to create our embeddings and store them in a vector store. Before jumping into how to query this vector store: if you like this type of content, please leave a thumbs up and subscribe to the channel. Now, let's see how it works when we query this vector store to get an answer back for any question about our documents. The user writes a question, and we query the vector store with that question. Let's see what happens under the hood during the vector store retrieval. The user's question is also transformed into an embedding, and now that our question is in the graph as well, we can easily see which vectors are the closest to it. The retrieval returns a list of relevant embeddings, which can then be mapped back to chunks of text. Now we have the most relevant chunks of text related to the question. The next step is to use an LLM: we give it the relevant chunks of text along with the user's question, and it generates a proper, well-formatted response for our user. And that's it.
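Putting the query side together, here is a sketch of that retrieval-then-answer flow. It reuses the model, index, chunks and client names from the earlier sketches, so it is an illustration of the flow rather than a standalone program.

```python
# Querying: embed the question, find the closest chunks, then let an LLM answer.
# Reuses `model`, `index`, `chunks` and `client` from the earlier sketches.
import numpy as np

question = "What does article 1 of the declaration say?"  # example question

question_vector = np.asarray(model.encode([question]), dtype="float32")
distances, ids = index.search(question_vector, 3)      # 3 closest embeddings
relevant_chunks = [chunks[i] for i in ids[0]]          # map ids back to text

prompt = (
    "Answer the question using only the context below.\n\n"
    "Context:\n" + "\n".join(relevant_chunks) + "\n\n"
    "Question: " + question
)

answer = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt}],
)
print(answer.choices[0].message.content)
```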
There was a lot to digest in this video, so don't hesitate to go back and pause to make sure you understand the whole concept. If you have any questions, let me know in the comments and I will be happy to answer them. In the next video, we will get our hands dirty in the code and implement these solutions. You will have a folder to put all of your documents in, and you will be able to ask anything about them. Thanks for watching, and see you in the next video!