Build a ChatGPT With Your Own Data | LLM, Embeddings, Vector Store Explained

Video Statistics and Information

Captions
Imagine you are a business owner. You have tons of company documents on your computer, some meeting notes in Notion, some other notes in Evernote, and messages on multiple channels like Slack and WhatsApp, but you constantly waste time searching for information in all of these places. A few months back you heard about people using ChatGPT as their personal assistant. - "So how can I help you?" - "Oh, it's just that everything feels disorganized, that's all." - "You mind if I look through your hard drive?" But you don't really see how this can apply to you, because your documents are private and may contain sensitive data, and you can't just paste all of your documents into ChatGPT since you have hundreds of them in a lot of different places. What if I told you that there is a way to build a personal assistant fed with your entire company knowledge base?

Hey friends, and welcome back to the channel. I'm Dona, a software engineer based in Sydney, and in this series of videos I will show you, step by step, how to build exactly that. In this specific video, I will focus on the different steps and concepts so you understand the big picture.

Let's start with our first concept: the LLM. LLMs, or large language models, are nothing more than a really smart, fancy, long-form autocomplete. When you send a request to an LLM like GPT, for example, it works out the most probable answer based on the billions of pieces of data it was trained on and returns it to you. And instead of giving you a single word, it gives you entire generated sentences. In this example, I send the prompt "the mouse is eaten by" to the LLM and it gives me back the most probable continuation. Now generalize that to a whole sentence and you understand how an LLM works.

Coming back to our problem, you could think: "Great, then I can just create my own LLM with my own documents." Yes, you could, but that is really expensive, and it wouldn't be as performant as one that has already been trained on billions of pieces of data. Retraining an LLM on that much data is really, really costly, so unless you're Elon Musk ("Hi Elon") and want to spend a few million dollars, that's not really a solution here. And if you are Elon Musk, send me an email; I would love to interview you on my channel. So you could think: "Okay, then I will just paste all of my documents into the prompt, on ChatGPT for example." Well, the limitation here is that it's not scalable, and an LLM like GPT-3 is limited in its context length, which means you can't just give it your entire document collection and ask it to search through it. For example, GPT-3.5 prompts are limited to about 4,000 tokens, which is around 3,000 words, which is around six pages in A4 format. So you're going to ask me: "Okay, then what's the solution? I want to see solutions, gentlemen, solutions..." Well, the solution is to use multiple concepts: chunking, embeddings and a vector store. I will explain all of these concepts in this video, so let's get into it.

So let's come back to our original problem. I have multiple PDF documents, some text files, and some WhatsApp conversations, for example. The first step is to split all of these texts into small pieces of data called chunks. Here I will divide all of the texts into small, meaningful chunks of data, say around 200 characters maximum. I will also cut at new lines to avoid mixing contexts. Here you can see, for example, how I divide the Universal Declaration of Human Rights.
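As a rough illustration of this chunking step, here is a minimal sketch in Python. The chunk_text helper is hypothetical, not the exact splitter used later in this series (libraries like LangChain ship ready-made text splitters); it just packs words into chunks of at most 200 characters and starts a fresh chunk at every new line:

```python
def chunk_text(text: str, max_chars: int = 200) -> list[str]:
    """Split text into chunks of at most max_chars characters,
    cutting at newlines first to avoid mixing contexts."""
    chunks = []
    for paragraph in text.split("\n"):
        current = ""
        for word in paragraph.split():
            # Start a new chunk when adding this word would exceed the limit.
            if current and len(current) + 1 + len(word) > max_chars:
                chunks.append(current)
                current = word
            else:
                current = f"{current} {word}" if current else word
        if current:
            chunks.append(current)
    return chunks

declaration = (
    "Article 1. All human beings are born free and equal in dignity and rights.\n"
    "Article 2. Everyone is entitled to all the rights and freedoms set forth..."
)
print(chunk_text(declaration))
```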
Now that this is done, I will create embeddings for each chunk of data. And you're going to ask me: "Dona, what's an embedding?" Embeddings are a way to represent information (text, images or even audio) in a numerical form, also known as a vector. Imagine, for example, that you want to store the following words: Orange, Apple, King and Queen. They could be stored as the following vectors: Orange (2, 2), Apple (3, 3), King (7, 6) and Queen (8, 6). If we plot these on a 2D graph, you will see that Apple and Orange are close to each other, and that Queen and King are also next to each other. That's because they are more alike: Apple and Orange are fruits, while Queen and King are both related to people.

But now you might wonder how to generate these vectors, and that's our second concept, the embedding model. To transform our chunks of text into embeddings, an embedding model is used. An embedding model is nothing more than an algorithm designed to transform data into vectors, and different embedding models will lead to different results. Going back to our example, another algorithm could, for instance, consider words similar if they are next to each other in alphabetical order. In that case, Apple would be closer to King than to Orange. You might say that doesn't make much sense, and you're right, but it's just a simplified version to show you how this whole thing works. In reality, these vectors contain thousands of dimensions instead of just two, which makes them way more powerful and accurate, and real embedding models are way more complicated than the algorithm I was just showing you.

Once we have transformed our chunks of text into embeddings, we need to store them, and we're going to use a vector store for that. A vector store is a place where all the embeddings are stored. As there are thousands of dimensions per embedding, every vector store has its own algorithms to store, index and search through them.

As a summary: all of the documents that we want to add to our personal assistant are divided into chunks of text, then we use an embedding model to create our embeddings and store them in a vector store. Before jumping into how to query this vector store: if you like this type of content, please leave a thumbs up and subscribe to the channel.

Now, let's see how it works to query this vector store and get an answer back for any question about our documents. The user writes a question, and we query the vector store with that question. Let's see what happens under the hood during the vector store retrieval. The user's question is also transformed into an embedding, and now that our question is also in the graph, we can easily see which vectors are the closest to it. The store returns a list of relevant embeddings, which can then be mapped back to their chunks of text. Now we have the most relevant chunks of text related to the question. The final step is to use an LLM: we give it the relevant chunks of text and the user's question, and it generates a proper, well-formatted response for our user. And that's it.
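To make the Orange/Apple/King/Queen toy example from earlier concrete, here is a small sketch using those made-up 2D coordinates. Real embeddings have thousands of dimensions, and real vector stores use smarter indexing than this brute-force comparison:

```python
import math

# Toy 2D "embeddings" from the example above; real models
# produce vectors with hundreds or thousands of dimensions.
embeddings = {
    "Orange": (2, 2),
    "Apple":  (3, 3),
    "King":   (7, 6),
    "Queen":  (8, 6),
}

def similarity_rank(word: str) -> list[str]:
    """Rank the other words by Euclidean distance, closest first."""
    target = embeddings[word]
    others = [w for w in embeddings if w != word]
    return sorted(others, key=lambda w: math.dist(embeddings[w], target))

print(similarity_rank("Apple"))   # ['Orange', 'King', 'Queen']
print(similarity_rank("Queen"))   # ['King', 'Apple', 'Orange']
```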
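And here is a minimal end-to-end sketch of the retrieval flow just described, under stated assumptions: the embed function below is a deliberately crude letter-frequency stand-in for a real embedding model (such as OpenAI's embeddings API or a sentence-transformers model), and a plain Python list stands in for a real vector store:

```python
import math

def embed(text: str) -> list[float]:
    """Toy stand-in for a real embedding model: a normalized
    letter-frequency vector. Swap in a real model in practice."""
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(c * c for c in counts)) or 1.0
    return [c / norm for c in counts]

# Indexing: embed every chunk once and keep (vector, chunk) pairs.
# In practice the chunks would come from the chunking step shown earlier.
chunks = [
    "All human beings are born free and equal in dignity and rights.",
    "Everyone has the right to life, liberty and security of person.",
    "No one shall be held in slavery or servitude.",
]
vector_store = [(embed(chunk), chunk) for chunk in chunks]

def retrieve(question: str, k: int = 2) -> list[str]:
    """Embed the question and return the k closest chunks."""
    q = embed(question)
    ranked = sorted(vector_store, key=lambda pair: math.dist(pair[0], q))
    return [chunk for _, chunk in ranked[:k]]

# Generation: hand the relevant chunks plus the question to an LLM.
question = "Does everyone have a right to liberty?"
context = "\n".join(retrieve(question))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
print(prompt)  # the last step would send this prompt to an LLM such as GPT-3.5
```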
There was a lot to digest in this video, so don't hesitate to go back and pause to take in the whole concept. If you have any questions, let me know in the comments and I will be happy to answer them. In the next video, we will get our hands dirty and dive into the code to implement these solutions. You will have a folder to put all of your documents in, and you will be able to ask anything about them. Thanks for watching, and see you in the next video!
Info
Channel: Donatien Thorez
Views: 1,446
Keywords: artificial intelligence, ai, ai assistant, the rise of artificial intelligence, artificial intelligence news, chatgpt, langchain, llm, llm explained for dummies, llm explained simply, llm explained, AI, ChatGPT, OpenAI, LLM, Chatbot, Chat, Vector Store
Id: EHyfNpnRiMQ
Length: 7min 27sec (447 seconds)
Published: Wed Sep 20 2023