FREE Local RAG Chatbot with Ollama - Streamlit and LangChain. Build with open-source Mistral AI

Video Statistics and Information

Captions
Hi everyone. You can now build RAG applications using Ollama embeddings, with better performance compared to OpenAI embeddings. For this application we are going to ingest data from URLs, convert it into embeddings, and store them in a vector database. When we ask a question, the relevant data is sent to the LLM, and we get more relevant answers at a faster pace.

For those who are not aware of Ollama: Ollama is a tool that allows you to run open-source LLMs like Llama and Mistral locally on your machine, so your data stays with you, resulting in significantly faster responses. With all your data being processed locally, you maintain complete confidentiality, making it an ideal choice for those who prioritize privacy. Today we'll be using the open-source LLM Mistral and an embedding model from Ollama to build a RAG application. As I mentioned, the process is: we take the URLs, convert them into embeddings, and store them in a vector database; via Ollama we use the open-source LLM Mistral together with the embedding model, and we get a response. That's how the whole application works.

Now let's also understand what kinds of models can be accessed from Ollama. All of them are open source: there's Gemma, Llama 2, Mistral, LLaVA, Code Llama, and also a few embedding models. Today, as I mentioned, we'll be using the Mistral 7B model released by Mistral AI (the updated version), which apparently outperforms Llama 2 13B on all benchmarks and Llama 1 34B on many benchmarks. And we will be using an embedding model known as nomic-embed-text, which apparently has a larger context window than the OpenAI embeddings, so it's perfectly suitable for people who would like to build advanced RAG applications.

That's about Ollama. Let's quickly jump into how we can download Ollama and how we can build applications using it. First you have to start by downloading Ollama locally on your computer: head over to the Ollama site and click on Download; it's available for Mac, Linux, and also Windows. Once you've downloaded Ollama, head over to your terminal and say "ollama run mistral". If you want to use Llama instead, you could say "ollama run llama2". I've already downloaded the model, so I won't be doing that. Once you're done with "ollama run mistral", just say "ollama pull mistral" — this basically pulls the model.

Once you're done with this, we'll install all the necessary libraries. As I mentioned, we'll be building this application with a Streamlit web UI, so we'll be using the Streamlit library here, and we'll also be using LangChain Community tools — for example, for splitting our text into chunks — and ChromaDB for the vector database. These are the libraries you need to pip install: langchain, langchain-community, langchain-core, streamlit, chromadb, and tiktoken. It says "requirement already satisfied" for me because I've already installed them.

Once you're done installing all the necessary libraries and pulling your LLM using Ollama, we create a new file called app.py. Start by importing all the necessary functions and modules. As I mentioned, we are importing Streamlit; from LangChain Community we'll be using the web-based loader, Chroma, and embeddings; from the LangChain Community LLMs we'll be using Ollama; and a character text splitter to split our text into chunks. These are the modules you need to import; a sketch of the imports follows below.
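Here's a minimal sketch of those imports. The module paths assume the langchain 0.1-era package layout; LangChain moves things between packages across versions, so adjust to whatever your installed version exposes:

```python
# app.py - imports for the local RAG app (paths assume langchain 0.1.x)
import streamlit as st
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.vectorstores import Chroma
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.llms import Ollama
from langchain.text_splitter import CharacterTextSplitter
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
```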
Once we are done with the imports, we start writing the code. We begin by initializing the model and writing a function for URL processing — call it process_input. What we are doing here is creating an instance of the Ollama model, named model_local, and specifying the model type we want to use — Mistral in our case — and this is the function we set up to process the URLs, so that users can enter their URLs and ask a question.

Continuing with the URL processing, we'll be using WebBaseLoader, which is used for loading documents from the web. We want users to be able to enter multiple URLs, separated by newlines, so we build a urls_list. What we are basically doing here is loading the documents from each of the URLs the user has entered in the list, using WebBaseLoader, and combining all the documents into a single list stored in a variable called docs_list.

The next stage basically consists of three steps: converting the document text into chunks, converting those chunks into embeddings and storing them in a vector database, and performing the RAG. The first step is to split the text documents into chunks. We'll be using a text splitter here — the CharacterTextSplitter class from the LangChain Community tools — where you can specify the chunk size and also the overlap. We store the result in a variable called doc_splits by calling split_documents on docs_list.

Once we are done splitting the documents into chunks, we have to convert them into embeddings and store them in a vector database. We start by creating the vector store: here we'll be using Chroma, via Chroma.from_documents, where the documents are the doc_splits we just created, and we give the collection a name, collection_name="rag-chroma". Now we have to convert the chunks into embeddings, so we use Ollama embeddings, and the model we'll be using here is nomic-embed-text — apparently it has better performance than the OpenAI embeddings and a larger context window, so it's perfect for building advanced RAG applications. Finally we create a retriever from the vector store. So what we have basically done here is create a vector store with Chroma, convert the chunks into embeddings using Ollama embeddings, and retrieve them from the vector store.

Now the third step is to perform the RAG. We define a variable called after_rag_template, whose system message is "Answer the question based only on the following context", and we supply that context below it. What we have done here is set up a chat prompt template for the model to generate answers based on the context provided by the retriever; then the processing chain is defined to pass the context and the question to the model, parse the string output, and return the answer. The ChatPromptTemplate again comes from the LangChain tools, and the context is what we retrieve from the vector store using the retriever. That's it — those are the three steps: split the documents into chunks, convert the chunks into embeddings and store them in a vector database, and perform the RAG. A sketch of the whole process_input function follows below.
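Here is a minimal sketch of the process_input function just described, continuing app.py from the imports above. The video doesn't spell out the chunk size and overlap values, so the numbers here are assumptions — tune them to your documents:

```python
# Initialize the local Mistral model served by Ollama
model_local = Ollama(model="mistral")


def process_input(urls: str, question: str) -> str:
    """Ingest the given URLs, embed them, and answer the question via RAG."""
    # One URL per line -> list of URLs
    urls_list = [u.strip() for u in urls.split("\n") if u.strip()]

    # Load each page from the web, then flatten into a single document list
    docs = [WebBaseLoader(url).load() for url in urls_list]
    docs_list = [item for sublist in docs for item in sublist]

    # Step 1: split the documents into chunks (size/overlap are assumed values)
    text_splitter = CharacterTextSplitter.from_tiktoken_encoder(
        chunk_size=1000, chunk_overlap=100
    )
    doc_splits = text_splitter.split_documents(docs_list)

    # Step 2: embed the chunks with nomic-embed-text and store them in Chroma
    vectorstore = Chroma.from_documents(
        documents=doc_splits,
        collection_name="rag-chroma",
        embedding=OllamaEmbeddings(model="nomic-embed-text"),
    )
    retriever = vectorstore.as_retriever()

    # Step 3: perform the RAG - retrieve context, prompt the model, parse output
    after_rag_template = """Answer the question based only on the following context:
{context}
Question: {question}
"""
    after_rag_prompt = ChatPromptTemplate.from_template(after_rag_template)
    after_rag_chain = (
        {"context": retriever, "question": RunnablePassthrough()}
        | after_rag_prompt
        | model_local
        | StrOutputParser()
    )
    return after_rag_chain.invoke(question)
```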
With the core logic done, we can begin with Streamlit, so that you can use this application through a web interface. We set up the Streamlit UI with a title — maybe "Document Query with Ollama" — and write a bit of description using st.write: enter URLs, one per line, and a question to query the documents. Then we have to provide the input fields so that users can enter their URLs: a text area for the URLs and a text input for the question. Next we set up a button to process the input — this is the submit button — so we say st.button("Query Documents"). That's it, the button is set up. We'll also use st.spinner, which is basically a visual cue for the user that their input is being processed. And then we finally set up the answer: we process the input — the URLs and the question — and show the result in a text area. So it's a very simple UI: a title, a bit of write-up, two input fields so the users can enter their URLs and a question, and a text area at the end where the users finally see their answer. A sketch of this UI code appears at the end of these captions.

So that's it — a very simple RAG application using Ollama. Once you're done writing the code, just enter "streamlit run app.py", and you'll get a link to your localhost; click on it and you'll end up here. This is how your web application looks: a basic title, "Document Query with Ollama". All you have to do is start entering URLs — you can enter as many as you want. Here I'm entering two URLs, the Wikipedia pages of Bill Gates and Jeff Bezos, and I'll start asking a question, "Who is Bill Gates?" for example. I click "Query Documents", the input is processed, and you see a very detailed answer on who Bill Gates is, all from the Wikipedia pages we entered.

That's a very simple RAG application using Ollama and the open-source model Mistral, all on your computer, so your data is not being shared with anyone — which is really cool. If you would like to see more such use cases of running LLMs locally on your computer, please do let me know in the comment section; I'm excited to share more such tutorials. Thank you for joining today. If you liked this tutorial, please consider giving it a share, and subscribe. See you in the next tutorial.
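For reference, here is a minimal sketch of the Streamlit UI described above, closing out app.py; the exact widget labels and sizes are assumptions:

```python
# Streamlit UI: title and a short description
st.title("Document Query with Ollama")
st.write("Enter URLs (one per line) and a question to query the documents.")

# Input fields for the user's URLs and question
urls = st.text_area("Enter URLs separated by new lines", height=150)
question = st.text_input("Question")

# Submit button: process the input with a spinner, then show the answer
if st.button("Query Documents"):
    with st.spinner("Processing..."):
        answer = process_input(urls, question)
        st.text_area("Answer", value=answer, height=300, disabled=True)
```

Save the file and launch it with "streamlit run app.py" as described above.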
Info
Channel: AI Product Builders
Views: 9,057
Keywords: datascience, generativeai, dataanaylst, ai, aiproduct, chatbot, llms, aitutorial, python, nocode, rag, ollama, streamlit, langchain, Mistral, opensource
Id: kfbTZFAikcE
Length: 10min 51sec (651 seconds)
Published: Fri Mar 08 2024