How to Build a Custom Knowledge ChatGPT Clone in 5 Minutes

Video Statistics and Information

Captions
I just built an AI chatbot with a custom knowledge base in just a few minutes, and in this video I'm going to show you how to do it and give you the exact code so you can steal it and use it in your own projects. The problem of adding a custom knowledge base to your LLM applications is one everyone wants to solve right now. I'm going to walk you through step by step how I've been able to do this and how you can do it too, and at the end I'm going to give you all of my code so that you can copy and paste it and apply it to your own business, or even create a new business off the back of it.

Now, what's the point of creating a custom knowledge chatbot anyway? Large language models like GPT-3 are limited to knowledge up until about 2021, and a lot of the simple applications that just use the GPT-3 API to do one function are being snapped up. The ability to put additional information into an LLM and query it about that information opens up enormous possibilities for creating new businesses, and I'm going to show you exactly how I've been using this to do the same.

An existing example I want to show you is chatpdf.com. This website takes a PDF, scrapes the information off it, indexes all that information, and then allows you to chat to a chatbot about the content of that PDF. What I'm going to show you goes beyond just PDFs, but I want to show you quickly on screen the kind of result we're looking for: the ability to take in external information and then talk about that information, which is what we're after in this video. I've searched the site for a PDF, and this time I've gone with something that wouldn't have been included in the original GPT-3 training: the Formula One 2022 regulations. We can now talk to this, and it shouldn't know anything unless it comes from the PDF we've chosen: what are the regulations regarding the usage of power units
in Formula One cars in 2022? As you can see on screen, we've been able to talk to the chatbot about the specific content of this document: "according to page 82..." What else does it say on page 82? Please summarize it. And it's able to tell me: page 82, XYZ. So this is the kind of functionality and chatbot we're going to achieve in this video, but I'm going to give you a ton of different ways to onboard different forms of data. You won't just be able to onboard PDFs, but transcripts, Discord messages, Google Calendar, all sorts of information that you can load into these models and use in a chatbot-style interface.

Now, enough talking about it; let's get to the code, and I'll show you exactly how to do this with some simple Python. What we're going to use in this video is called LlamaIndex. I'll leave links to the documentation in the description, but for now I'll run you through the basic usage pattern so you can understand what it's doing, and then we can get into some interesting use cases for it.

To start, we just need to import os and set our OpenAI key. The first thing you need to do is take all of your data, using one of their data loaders, and load it into a documents variable for later use. To show you this in action in a very simple way, I've taken an article by Meta AI discussing the release of LLaMA, their large language model, put it into a raw text file, and saved it to a data folder which I'm going to call in a second. The reason I went for an article on LLaMA is to show you that it really is taking in information it did not know and telling you about it; LLaMA was released very recently, so there's no way GPT-3 can understand what LLaMA is. To do this I import SimpleDirectoryReader from llama_index and call it with a reference to where I'm
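As a concrete sketch of the key-plus-loader setup just described (my own addition, assuming an early-2023 `llama-index` release; newer versions have since reorganized these imports), the loading step might look like:

```python
import os

# Assumption: you supply your own OpenAI key; never commit it to source control.
os.environ["OPENAI_API_KEY"] = "sk-..."  # replace with your key

from llama_index import SimpleDirectoryReader  # pip install llama-index

# Reads every file in the ./data folder (here, the LLaMA article saved as .txt)
documents = SimpleDirectoryReader("data").load_data()
print(f"Loaded {len(documents)} document(s)")
```

SimpleDirectoryReader picks up the plain-text files in the folder; the other Llama Hub loaders mentioned later follow the same load-into-documents pattern.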
pulling it from, which is just this data folder here. Then I load that data into my documents variable.

Next, I need to use those documents to create an index. It's very simple to do: from llama_index import GPTSimpleVectorIndex, then call it with this function here. This takes your documents and creates an index out of them that you can then query for answers. I run this, it creates the index, and then I'm able to use the very simple command index.query, enter the question I want to ask, and print out the response. "What do you think of Facebook's LLaMA?" is my question, and the answer is "I think Facebook's LLaMA is a great step forward in democratizing...", so it knows what we're talking about.

Now that you understand the basic usage pattern of LlamaIndex, which is to load some data, create the index, and then query that index, we can get into some more advanced topics: how to adjust the models we use in order to get different and more specific results. The index we used in the first example was a GPTSimpleVectorIndex, so now I'm going to show you how to change the model you use and the settings within that model so you can get different kinds of output. This can easily be done by importing an LLMPredictor. We set up that LLMPredictor with our LLM, set the temperature, which I'm sure you're familiar with if you've watched my recent videos, and change the model name. This gives us more flexibility in what we use as part of this indexing engine and the kind of results we get, by specifying the temperature, the model, the frequency penalty, and so on, as you're used to with other models. Once we've set up the LLMPredictor, all we need to do is create a PromptHelper and give it a few parameters, such as the max input size.
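Putting those pieces together, here is a sketch of the predictor-and-helper setup with the llama-index and langchain APIs as they existed in early 2023. The specific PromptHelper numbers are typical example values, not requirements, and newer llama-index versions replace this whole arrangement with a ServiceContext/Settings object:

```python
from langchain.llms import OpenAI  # llama-index wrapped langchain's LLM classes at the time
from llama_index import (GPTSimpleVectorIndex, LLMPredictor, PromptHelper,
                         SimpleDirectoryReader)

# Swap the model name and temperature here to change how the index answers.
llm_predictor = LLMPredictor(
    llm=OpenAI(temperature=0.5, model_name="text-davinci-002")
)

# max_input_size / num_output / max_chunk_overlap keep each prompt inside the
# model's context window so queries don't fail.
prompt_helper = PromptHelper(max_input_size=4096, num_output=256,
                             max_chunk_overlap=20)

documents = SimpleDirectoryReader("data").load_data()
custom_llm_index = GPTSimpleVectorIndex(
    documents, llm_predictor=llm_predictor, prompt_helper=prompt_helper
)
print(custom_llm_index.query("What do you think of Facebook's LLaMA?"))
```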
These parameters basically ensure you're not going to run into any issues when you're querying, because of the custom LLM you're using. Once we have the prompt helper, you can just copy and paste all of this stuff. I know it looks a little complex, but I promise it's pretty basic; copy it over and the prompt helper will be set up for you. Now you set up the same GPTSimpleVectorIndex class, but this time including the llm_predictor, which takes in the settings we gave it: the adjusted temperature and a model changed from the original. That gives us the custom LLM index variable, which we can then query with our own questions. Using that same pattern, we can call index.query and ask our question. As you can see, we got a completely different answer from this text-davinci-002 model compared to the original, which I believe was text-davinci-003. So by changing these models around, you can get it to do different things for you.

Now we can get to the good stuff, which is using different LlamaIndex loaders to load different information and get really cool results. To give you an example of the huge range of data types LlamaIndex allows you to bring on board, let's head over to the documentation and then to what they call Llama Hub. On Llama Hub you can see all these different data-loading methods. This is the real magic of LlamaIndex, and I'm going to make a ton of videos on these different data connector types so you can see them in action. For example, we can click on Google Calendar: this reads your Google Calendar, passes the relevant information into documents, and then allows you to query the index and ask what's coming up on your calendar. If that doesn't set off light bulbs in your head, I don't know what will. And the list goes on: with the Google Docs loader you can easily load Google Docs information into your models.
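To show how these Llama Hub loaders are pulled in, here is a hedged sketch using download_loader with the Google Calendar loader as the example. The loader name matches what Llama Hub listed at the time, but the load_data parameter and the Google OAuth setup it needs are illustrative assumptions on my part:

```python
from llama_index import GPTSimpleVectorIndex, download_loader

# download_loader fetches a loader class from Llama Hub by name at runtime.
GoogleCalendarReader = download_loader("GoogleCalendarReader")

# Assumes Google OAuth credentials are already configured locally;
# number_of_results is an illustrative parameter, check the loader's docs.
loader = GoogleCalendarReader()
documents = loader.load_data(number_of_results=50)

index = GPTSimpleVectorIndex(documents)
print(index.query("What's coming up on my calendar this week?"))
```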
You also have Notion loaders, Slack loaders, Twitter loaders, WhatsApp loaders, all sorts of things. But what we're looking at here is the Wikipedia loader. All we need to do is import download_loader, call the WikipediaReader, and give it any Wikipedia page name, and it takes all that information and makes it available through our index. Here's the example I've got, and again I'm trying to use some very recent data to prove that this isn't just GPT-3's underlying knowledge base: we're actually giving it a custom knowledge base. The page is Cyclone Freddy, which happened recently in 2023, the longest-lived tropical cyclone on record. This is obviously very recent information, so now we can build an index from it and ask about this particular cyclone.

This is very simple to do in typical LlamaIndex fashion. All you need to do is instantiate the WikipediaReader class, and then we have the ability to load in pages by simply naming the pages we want, which in this case is Cyclone Freddy. Run the cell, and as before we create an index using the documents we've made, which are called wiki_docs here. Run the cell again, and now the index is built, and all we need to do is query it in the same pattern as before: you load the documents, you create the index, and you query the index, the same thing over and over. What countries were affected by Cyclone Freddy? "Cyclone Freddy affected Madagascar, Mozambique and Zimbabwe," which is correct.

Please note that I did have a few issues with the Wikipedia loader. I'm not sure why, but it seems to insert and change characters depending on what you put in as a page name, so I had to play around a bit to get the Cyclone Freddy page working. This is just one example, and I'm sure they'll fix it; maybe it would be better if they had you put in a link to the page rather than the name, but I'm sure it'll be sorted soon enough.
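Concretely, the Wikipedia example above might look like this (same early-2023 API caveat as before, and the page title has to match Wikipedia's exactly, which may explain the character issues mentioned):

```python
from llama_index import GPTSimpleVectorIndex, download_loader

# The loader pulls pages with the `wikipedia` package under the hood.
WikipediaReader = download_loader("WikipediaReader")
wiki_docs = WikipediaReader().load_data(pages=["Cyclone Freddy"])

# Same load -> index -> query pattern as before.
index = GPTSimpleVectorIndex(wiki_docs)
print(index.query("What countries were affected by Cyclone Freddy?"))
```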
Now we get to a bit more of a practical example: a customer support bot you can make by putting in your frequently asked questions and all sorts of material from your existing customer support resources, then creating an index you can query and ask questions about your FAQs. I've gone to ASOS, headed to the customer support section, gone through all the different questions and answers they have set up, copied the information, and put it into plain .txt files. So we have a bunch of different information here that we can query, and as before, all we need to do is use the SimpleDirectoryReader, point it at this ASOS folder I'm referencing, and load the data into the documents variable. Then, as before, we run GPTSimpleVectorIndex on our documents and it indexes all the information in those little customer support text files. Now I can ask it: what Premier delivery options do I have in Saudi Arabia? "In Saudi Arabia you have the option of signing up to ASOS Premier, which gives you free standard delivery..." I can change this to the UAE, and it gives me the information regarding the UAE, which I have in the documents here. And yes, it is that easy: take your information, put it in a folder, tell it to load it, and start asking questions about it.

Now I want to give you one more example of a data loader available through LlamaIndex, which is a YouTube transcript loader. As we can see, if we come over here and find YouTube transcript, this loader fetches the text transcript of YouTube videos using the youtube_transcript_api Python package, very simple to implement, as you can see here. What I've done is take a Dave Nick video, a complete YouTube automation
tutorial for beginners, which I feel is a fairly good use case. Say you wanted to load in a massive YouTube video, like a two-hour course, and essentially query the information within that course: you can put it in here and get an index created on it. Again, very simple to do: you instantiate the YoutubeTranscriptReader, put in the URL as shown in the documentation for the YouTube transcript loader, run the cell to create our documents object, and create the index. Here I've asked it: what are some YouTube automation mistakes to avoid which are mentioned in the video? There's a section on it, and it's able to tell me: re-uploading people's content without permission, and it goes through the list of things mentioned in the video.

Now, with those examples out of the way, I know what you're thinking: this doesn't look like a chatbot to me. And you'd be right. This is not a chatbot; this is simply querying an index. We send it one question and it sends one answer back, and there's no conversation history or context going on. I couldn't find anything online on how to convert LlamaIndex's querying system into a chatbot, so I took it upon myself to figure it out, and I was able to create this chatbot class, which is working very well. You'll be able to copy and paste it into your own projects; I'll have all the code in the description so you can get access to it. I don't want to bore you with the technical side, but essentially it maintains a messages list, very similar to the ChatGPT API, which I wasn't able to get integrated into this, so I had to do it myself. While I assume the LlamaIndex team is still working on a ChatGPT API integration, we can create our own version of it here using messages: every time you send a message and get a response, both are added to a messages list.
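Before moving on to the chatbot, the YouTube-transcript steps above can be sketched as follows (my addition; the ytlinks parameter name follows the Llama Hub documentation of the time, and the URL is a placeholder for whichever video you want to index):

```python
from llama_index import GPTSimpleVectorIndex, download_loader

# Requires: pip install youtube_transcript_api
YoutubeTranscriptReader = download_loader("YoutubeTranscriptReader")
documents = YoutubeTranscriptReader().load_data(
    ytlinks=["https://www.youtube.com/watch?v=..."]  # replace with the video link
)

index = GPTSimpleVectorIndex(documents)
print(index.query("What are some YouTube automation mistakes to avoid?"))
```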
Then every time you query it, it remembers those messages, so you have some kind of context running through the chat. All you need to instantiate this chatbot is your API key and the index you want it to use. In this case I was going to use the Cyclone Freddy wiki index again, but to make sure this isn't drawing on any other sources of information or existing knowledge, I'm going to use an index based on the LLaMA article from the very first example. I've run the chatbot, so now I can ask it: what is LLaMA? "LLaMA is a state-of-the-art foundational large language model." Who created it? I can just say "who created it?", and it remembers what was said previously, knows I'm referring to LLaMA, and responds that Meta AI created LLaMA, a large language model. While this doesn't look flashy, because we're in a Jupyter Notebook, it shows the ability to load your own custom data set and teach these models something they don't know, giving them a new custom knowledge base, and then not only query them in a single-shot fashion but have a chat interface with them, going back and forth while it remembers the previous messages. All of this code is going to be available in the description, so if you want to take it and copy and paste it over to your own custom knowledge base, you can do that down below.

Now I really, really urge you to look at how you can apply this to businesses. With the ChatGPT API coming out, it has opened up so many doors: if you're able to load custom data into these models and create a chatbot like this, you can create a business overnight with this kind of technology. And please, after this video, don't just leave it here; go to Llama Hub and have a look at some of the data loaders they have available. It will really open your eyes.
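The chatbot wrapper described above can be sketched as follows. This is my minimal reconstruction, not the author's exact class: it assumes only that the index object exposes a .query(text) method, as the llama-index indexes here do, and it folds recent history into each query so the bot keeps context between turns.

```python
class Chatbot:
    """Wraps any index with a .query(text) method and keeps chat history."""

    def __init__(self, index, history_limit=10):
        self.index = index
        self.messages = []              # running conversation, ChatGPT-API style
        self.history_limit = history_limit

    def chat(self, user_text):
        self.messages.append({"role": "user", "content": user_text})
        # Fold recent history into the prompt so the index sees the context.
        recent = self.messages[-self.history_limit:]
        context = "\n".join(f"{m['role']}: {m['content']}" for m in recent)
        answer = str(self.index.query(context))
        self.messages.append({"role": "assistant", "content": answer})
        return answer


# Stub index for demonstration; swap in a GPTSimpleVectorIndex in practice.
class EchoIndex:
    def query(self, text):
        return f"Answer based on: {text.splitlines()[-1]}"


bot = Chatbot(EchoIndex())
bot.chat("What is LLaMA?")
print(bot.chat("Who created it?"))  # the second turn sees the first in its context
```

Because the wrapper only needs a .query method, you can develop and test the history logic with a stub like EchoIndex and plug in the real index afterwards.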
Go through all of them! I had such a good time going through and understanding the different ways, data types, and sources that Llama Hub allows you to load into LlamaIndex: things like Slack, Discord, Google Calendar, Reddit, CSV files, all of this can be loaded into these indexes and used in your own chatbot. I'm going to be doing some cool YouTube videos for you on this, on how to create a quick application using some of these loaders, and my development team and I have already been working with clients using this exact strategy to create custom chatbots. So if you have a big idea you want to talk to me about, or you want to speak to me as a consultant, or you want my development team to build you an entire application or platform, the link is down below if you want to have a chat or get something built with me and my team.

Thank you so much for watching, guys. That's all for the video, but if you want this code, remember to head down to the description; you can copy and paste it over to your own projects and get something like this spun up. If you really enjoyed the video, please leave a like, as it helps my channel so much, and if you like this kind of content, please subscribe and hit the bell so you know when I post my next one. That's all for today; thank you so much for watching, and I'll see you in the next one.
Info
Channel: Liam Ottley
Views: 68,810
Keywords: how to start an ai business, how to use ai in my business, 2023 business ideas, internet business 2023, online business 2023, best online businesses, make money online 2023, small business ideas 2023, ai businesses, ai startup ideas, artificial intelligence startups, how to start an ai company, no code business ideas, Creating an AI Tool Without Writing a Single Line of Code, no code ai startup, how to make an ai tool, how to make ai tool website, bubble.io tutorial, llamaindex
Id: sUSw9MaPm2M
Length: 14min 3sec (843 seconds)
Published: Tue Mar 14 2023