Create a Custom Chatbot for any Website with LangChain and Llama 2 (Free LLMs and Embeddings)

Video Statistics and Information

Captions
Hi everyone. In this video tutorial we will learn how to create a chatbot for your own website using Llama 2 and LangChain. This is the complete architecture diagram for our chatbot, but before I discuss it, let me show you one thing. I have just launched a LangChain course in which I will be building 12 large language model apps using OpenAI and Llama 2. The course content and all the project details are provided here, and you can check the course description for a more detailed overview. You can check out our other courses as well: we have multiple object detection courses covering the latest state-of-the-art models, including YOLOv7, YOLOv8, and YOLO-NAS. 1,466 students have enrolled, and the course has a 4.5 out of 5 rating. It is a new course which I launched one day back; the link is in the description, so do check it out and let me know your thoughts.

Now let's discuss the architecture diagram. Every website contains multiple pages, each with its own URL; a website can have 20, 30, or even 100 pages. If we want to create a chatbot for a website, we need access to the data on those pages, so first we need to find out how many pages the website has. For example, to see the pages of openai.com, I can open sitemap.xml: the sitemap lists all the pages in the website. You can see the long list of URLs here, one for each page of the OpenAI website. You can inspect other websites in the same way by appending sitemap.xml to the domain. The sitemap also shows the change frequency of the content on each page (for example, daily) and its priority level.

So in this tutorial I will show you how to create a chatbot for your own website. Using sitemap.xml we find the URLs of all the pages, we extract the data from each of those pages, and then we split that data into small chunks. Next, we create embeddings for each text chunk. What are embeddings? If we want to compress our text, we can convert it into floating-point numbers using embeddings; in other words, embeddings are
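The sitemap step above can be sketched in a few lines of Python. To stay self-contained, this parses an inline sitemap string with hypothetical URLs; a real crawler would first fetch the site's sitemap.xml over HTTP:

```python
import xml.etree.ElementTree as ET

# A tiny inline sitemap standing in for a fetched sitemap.xml (hypothetical URLs).
SITEMAP = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc><changefreq>daily</changefreq></url>
  <url><loc>https://example.com/blog/llama-2</loc><changefreq>weekly</changefreq></url>
</urlset>"""

def sitemap_urls(xml_text: str) -> list[str]:
    """Return every <loc> (page URL) entry from a sitemap document."""
    ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
    root = ET.fromstring(xml_text)
    return [loc.text for loc in root.findall(".//sm:loc", ns)]

print(sitemap_urls(SITEMAP))
```

The returned list is exactly what we later hand to the URL loader.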
vectors of floating-point numbers, for example [0.1, 0.3, 0.9]. We will create embeddings for each text chunk. So to recap: first we extract the data from all the pages of the website, then we split it into small chunks, and then we create embeddings for each chunk. After that we store the embeddings in a knowledge base: we build a semantic index and store the embeddings in a vector store, or vector database. Here we will be using the FAISS vector store; FAISS is Facebook AI Similarity Search. Once all the embeddings are stored, the knowledge base holds all the information about the data available on the website.

Now, when the user asks a question, we create embeddings for that question in exactly the same way.

After creating embeddings for the user's question, we perform a semantic search. The knowledge base stores the website's data in the form of embeddings, and the question has also been converted into an embedding, so we search the knowledge base for the chunks most relevant to the question. We then take the top six matches; you can change this value to return the top three or top four, but by default it returns the top six. In the next step we pass the user's question together with those top six chunks to the large language model, which is Llama 2 in this case, and the model generates a natural-language response that is returned to the user. That is the complete architecture, so let's move to the code.

In the first step we install all the required packages: langchain, accelerate, transformers, sentencepiece, and sentence-transformers, because we will be downloading free sentence-transformers embeddings. I will also show you how to do all of this with OpenAI models as well as with Llama 2. We install the unstructured package so that we can extract the data from the URLs, and faiss-cpu because we are using FAISS as our knowledge base, or vector database.
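The retrieval step described above (embed the question, compare it against the stored chunk embeddings, keep the top k) can be sketched without any vector-database library. This is a toy stand-in, not FAISS: the `embed` function uses a made-up eight-word vocabulary with bag-of-words counts, whereas real sentence-transformers embeddings have 768 dimensions.

```python
import math
import re

# Made-up vocabulary for the toy embedding; real models learn these dimensions.
VOCAB = ["llama", "faiss", "sitemap", "model", "search", "page", "open", "website"]

def embed(text: str) -> list[float]:
    """Map text to a fixed-size vector of floating-point word counts."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [float(words.count(w)) for w in VOCAB]

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(question: str, chunks: list[str], k: int = 6) -> list[str]:
    """Semantic search: rank stored chunks by similarity to the question."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

chunks = [
    "Llama 2 is an open source large language model.",
    "FAISS is a similarity search library.",
    "The sitemap lists every page of the website.",
]
print(top_k("what is llama 2", chunks, k=2))
```

FAISS does the same ranking, but over millions of dense vectors with optimized index structures instead of a linear scan.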
Along with FAISS you could also do this with Pinecone, but in this tutorial we will focus on FAISS as our vector database. Here I am importing all the required libraries: UnstructuredURLLoader, so that I can pass in all the URLs of my website; CharacterTextSplitter, to split the data gathered from those pages into small chunks; OpenAIEmbeddings, because I will also show the OpenAI route for converting text chunks to embeddings; and HuggingFaceEmbeddings, the free embeddings from Hugging Face that we will use by default. Then I have ChatOpenAI, and FAISS as our knowledge base. Next is RetrievalQAWithSourcesChain, which gives us the answer to our question along with the source, i.e. which URL the information was gathered from. Then we have AutoTokenizer, so that we can convert our text into a numerical array, and HuggingFacePipeline. There are two ways to access a model from Hugging Face, where Llama 2 is also hosted: one way is through the API (I tried to access the Llama 2 model through the API, but it did not work), and the second approach is to download the model into your Colab notebook or locally. So I will download the Llama 2 model from Hugging Face into this Colab notebook and then create a pipeline around it. I also import sys, which I use when I want to stop the process; os, to set an environment variable; torch; and nltk. I download the nltk toolkit so that the text can be extracted from the websites.

Here I set the OpenAI API key as an environment variable so that I can use OpenAI models in this Colab notebook as well. OpenAI models are not free; they have a cost associated with them, so you need a paid OpenAI account to access them.

Now you can see that I am passing in several URLs, so let me show them one by one; I will extract the data from these URLs and build a chatbot that can answer questions from the information available on them. The first URL is the recently written blog post about Llama 2. The next is the blog post on MPT-7B, a recent large language model with 7 billion parameters. Then we have the Stability AI blog post announcing StableLM, the first of Stability AI's suite of open-source large language models. And the last one is the blog post on the Vicuna model, which was also launched fairly recently: "Vicuna: an open-source chatbot impressing GPT-4 with 90% ChatGPT quality". We will extract information from all of these blog posts.

Now I load these pages with UnstructuredURLLoader and extract the data available on them; this may take a few seconds. You can see that we have extracted all the information on each page, and since we passed four URLs, the data is divided into four documents. Next we need to divide this text into chunks, because the Llama 2 model has an input token limit of 4,096 tokens. One token corresponds to roughly four English characters, so you can pass at most about 16,000 English characters as input to the Llama 2 model. Our extracted text is much longer than 16,000 characters, so we cannot pass it directly as input; we need to divide it into chunks. Here I have set the chunk size to 1,000, so the text is divided into chunks of at most 1,000 characters, with an overlap of 200 characters between consecutive chunks. Let me show you the overlap concept: if you look here, the sentence comparing the 7-billion and 13-billion-parameter Llama 2 models against MPT models on benchmarks appears at the end of one chunk and again at the start of the next; that repeated text is the 200-character overlap between one chunk and the next.
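The splitting step can be sketched in plain Python as a sliding window. This is a simplified stand-in for LangChain's text splitter, not its actual implementation; the sizes mirror the tutorial's chunk_size=1000 and overlap=200, scaled down in the demo call for readability:

```python
def split_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Slide a window of chunk_size characters, stepping by chunk_size - overlap,
    so each chunk repeats the last `overlap` characters of the previous one."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]

# Llama 2 accepts 4,096 tokens; at roughly 4 characters per token that is
# about 16,000 characters, so longer pages must be chunked.
MAX_INPUT_CHARS = 4096 * 4

text = "abcdefghij" * 50          # 500 characters of stand-in page text
chunks = split_text(text, chunk_size=100, overlap=20)
print(len(chunks), chunks[0][-20:] == chunks[1][:20])  # → 6 True
```

The `True` confirms the overlap: the last 20 characters of each chunk reappear at the start of the next, so no sentence is cut cleanly in half at a chunk boundary.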
Now I will download the Hugging Face embeddings. I will also show you how to do this with OpenAI embeddings, but for now we are using the Hugging Face embeddings, which are free to use, plus the Llama 2 model, which is also open source and free. With the embeddings downloaded, you can check the embedding dimension: you can see our embeddings have 768 dimensions. Next I create the vector database, or knowledge base. We have converted our text into small chunks, 89 of them, so to recap: we extracted the data from the four URLs and split it into small chunks, and now we need to create embeddings for each text chunk and store them in the knowledge base. That is what I am doing here: converting the text chunks into embeddings and storing those embeddings in FAISS, which is our knowledge base.

To access the model from Hugging Face, you need to connect your Colab notebook to Hugging Face. I log in from here: go to your Hugging Face settings, open Access Tokens, copy your access token, paste it here, and click Login. You can see "Login successful", so I can now download the model as well as the tokenizer from Hugging Face. I had already downloaded it once, so I just restart my runtime and import all the libraries again; you do not need to log in a second time. Now it is downloading the tokenizer and the Llama 2 model. By default it downloads the 16-bit model, but on the free tier of Colab I only have a 15 GB GPU available (with a paid subscription, Colab offers a 40 GB A100 GPU). Since I am on the free tier I have to consider my memory usage, so instead of the default 16-bit model I am loading the model in 8-bit. You can also load a 4-bit model, but it gives noticeably worse results, so I am using 8-bit here. The download takes a few seconds. Then I create the pipeline, setting max_new_tokens, the maximum number of tokens in the output (one token is roughly four English characters). To test it, I ask: "Please provide a concise summary of the book The Alchemist." This may take a few seconds, and here is the concise summary. The brief summary covers the plot, key elements, and themes of Paulo Coelho's novel about Santiago, a shepherd who leaves his life behind for an adventure in quest of his Personal Legend, the innermost desire that can be fulfilled only when he listens to his heart. Santiago sets out for the pyramids, follows a series of signs and omens, and along the way meets various people who teach him about spirituality, self-discovery, and perseverance. When he reaches the oasis, he discovers Fatima, the love of his life, and learns the importance of staying true to one's heart and spirit. So you can get a summary of The Alchemist or any other book; this was just to check that the model is working.

Now back to our original tutorial. I have passed the Vicuna blog URL, the one titled "an open-source chatbot impressing GPT-4 with 90% ChatGPT quality", and I have already extracted its data, so I am asking a question from that website: "How good is Vicuna?" Let me execute this cell. This may take a few seconds before we get the answer. Here it is: Vicuna is a large language model that achieves 90% of the capability of ChatGPT, but it still has limitations. There is a stray character attached to the output; once that is stripped out, we have a perfectly clean answer. Next you can ask "How does Llama 2 outperform other models?" You can ask
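Under the hood, the question-answering chain essentially stuffs the retrieved chunks and the user's question into a single prompt for the model. A minimal sketch of that assembly step (the prompt wording here is my own, not LangChain's exact internal template, and the chunk texts are illustrative):

```python
def build_prompt(question: str, retrieved_chunks: list[str]) -> str:
    """Assemble the context-stuffed prompt that is sent to the LLM."""
    context = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(retrieved_chunks))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

# Illustrative chunks, standing in for the top-k results of the semantic search.
retrieved = [
    "Vicuna achieves roughly 90% of ChatGPT quality in preliminary evaluations.",
    "Vicuna still struggles with reasoning and mathematics tasks.",
]
prompt = build_prompt("How good is Vicuna?", retrieved)
print(prompt)
```

Whatever text the model generates after "Answer:" is what the chain returns to the user, which is why the answer stays grounded in the website's own pages.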
this question as well, so let's see what response we get; this may take a few seconds. Here is the answer: the Llama 2 models outperform other models in various benchmark categories, particularly the open-source MPT and Falcon models, and they also compare favorably to closed-source models like GPT-3.5 and PaLM. Next I ask "What is StableLM?", and the answer is that StableLM is a suite of language models developed by Stability AI: a clean answer. And if I ask "Can you please share some details about the MPT-7B model?", it says it cannot provide more information on the MPT-7B model, as the retrieved article does not contain enough information on that topic. So this is how you can do question answering over your own website.

Now let me give you a quick overview of how to do the same thing with OpenAI models. You can see that I have passed my OpenAI API key, and the one thing I change is that I download the OpenAI embeddings here instead of the free Hugging Face embeddings, and I use the ChatOpenAI model as well. Once it is initialized, let's check which model it uses: by default it is the GPT-3.5-turbo model, which is fine. Going down, I ask the same question, "Please provide a concise summary of the book The Alchemist"; I had missed the .predict call, so after adding it we get the summary: The Alchemist is a novel written by Brazilian author Paulo Coelho; it follows the journey of a young shepherd, Santiago, who sets out to find a hidden treasure after having a recurring dream. So here is the concise summary.

Now let's set up the chain and ask the questions again. Here is the answer to "How good is Vicuna?": Vicuna achieves 90% of the quality of ChatGPT, but it has limitations in tasks involving reasoning or mathematics and may have limitations in factual accuracy. You can see that with OpenAI it performs better than Llama 2 and gives a much more detailed answer, which is good as well. Let's go further and ask "How does Llama 2 outperform other models?"; this takes around five to six seconds. Here is the answer: the Llama 2 models outperform MPT models of similar size in all categories except code benchmarks, and they also outperform Falcon models in all benchmark categories; the 70-billion-parameter Llama 2 model achieves improved results on the MMLU and BBH benchmarks compared to the Llama 1 model. So we have a detailed answer here as well. If I ask "What is StableLM?": StableLM is an open-source model released by Stability AI, available in various parameter sizes; again quite a detailed response. And if I ask "Can you please share some details about the MPT-7B model?", note that the Llama 2 model was unable to generate a response, but here we get: the MPT-7B model was built by the MosaicML team using the tools available to every customer of MosaicML; it was trained on the MosaicML platform using A100 40 GB and A100 80 GB GPUs from Oracle Cloud. So here we have quite a detailed answer as well, and we can do complete question answering over the website this way.
Finally, I wrap this up in a simple interactive loop. If I type "How good is Vicuna?", let's see the response; this may take a few seconds. You can see our complete chatbot working: in summary, Vicuna has limitations, and it is not good at tasks involving reasoning and mathematics; however, after fine-tuning with GPT-4 assistance it becomes capable of generating more detailed and valid answers. Next, "Does Llama 2 outperform other models?"; this takes a few seconds before the response is generated. Yes, Llama 2 outperforms other models in various benchmark categories except for code benchmarks, and we get a detailed response about the 7-billion-parameter Llama 2 model. If I ask "What is StableLM?" this time, there is no relevant information returned, and "How big is StableLM?" also gives nothing, so let me ask "Can you please share some details about the MPT-7B model?" instead; after a couple of seconds: the MPT-7B model was trained on the MosaicML platform using A100 40 GB and A100 80 GB GPUs from Oracle Cloud. So here we have a clear answer. I type exit to leave the loop. That's all from this tutorial; I hope you have learned something. See you all in the next video, till then, bye!
Info
Channel: Muhammad Moin
Views: 4,230
Keywords: Llama2, Llama, LangChain, OpenAI, chatgpt, ai, artificial intelligence, Large Language Models, Generative AI, LLMs, chatbot, chatbot using python, chatbot tutorial, deep learning
Id: M4G2zk4X93Y
Length: 31min 38sec (1898 seconds)
Published: Sat Jul 29 2023