Connect ChatGPT to your Data in Databricks and MLflow

Video Statistics and Information

Captions
We all know that it is possible to connect your own data to OpenAI models like ChatGPT and start chatting with it. On the other hand, we have Databricks, a unified data platform that can even support deploying machine learning models through MLflow. So here's the question for this video: can we create a chatbot that uses OpenAI models like ChatGPT to chat with our own data, and deploy that chatbot as an endpoint using MLflow in Databricks? Then let's go.

Hello everyone, this is MG, and welcome to another video. We previously recorded multiple videos about how you can use OpenAI models like ChatGPT, GPT-3.5, or GPT-4 to chat with your data: you can connect these models to your enterprise data and start asking questions about your internal data. Now here is the follow-up question I have for this video: how can I create my own chatbot, similar to ChatGPT, that can talk with my own data, and deploy it as an application in Databricks? As we all know, Databricks supports MLflow, which means you can easily deploy your machine learning models through MLflow and create an endpoint for them. Can we deploy our large language model chatbot application through MLflow and create an endpoint for it? This is exactly what we're going to check in this video. That means you can take your LLM-based application, or your OpenAI models that talk with your own data, package it as an app, deploy it to MLflow, and start interacting with it through an endpoint. Then let's check it out. Before we start, make sure you subscribe and hit the bell icon so you get notified about the next video. Thank you.

All right, welcome back everyone. In this video we are going to use a solution accelerator developed by Databricks to help us create and deploy a chatbot in Databricks, using MLflow, that is able to chat with our own data. This is the GitHub repository that contains all the code we need for this accelerator to create the chatbot in our own Databricks environment. I will add the link to this repository to the references section of the Discord channel; I'm using that channel to host all the references for my videos, so later on, if you want to check any reference or code from any video I have recorded, you can simply search that Discord channel and find it.

To use this GitHub accelerator in your Databricks workspace, you simply go to Repos and click Add Repo, then copy and paste the URL of the GitHub repository I just showed you (again, I will add the link to the Discord channel). Once you paste it, click Create Repo, and that's it: you will see the QA bot repo with all the code coming in from the repository we just talked about, and we are good to go with deploying this code and creating the chatbot in our environment. It is very easy to deploy this in one click and create the chatbot using MLflow, but I want to go through all the steps and code that this accelerator uses, to make sure we understand how it works, so you can then leverage it for your own use cases.

Okay, to run this accelerator end to end, you'll see there's a notebook called RUNME. If you click on it, it provides a description of how you need to run this and what the steps are. Long story short: first, it installs some specific packages and imports them.
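As a minimal sketch, that setup cell amounts to something like the following; the package list and the secret scope/key names are my assumptions, not necessarily the accelerator's exact values:

```python
# Setup sketch for a Databricks notebook (hypothetical names throughout).
# In a notebook cell you would first run:
#   %pip install openai langchain faiss-cpu
import os

# dbutils is available implicitly inside Databricks notebooks; the scope
# and key names below are placeholders for wherever you stored your key.
os.environ["OPENAI_API_KEY"] = dbutils.secrets.get(
    scope="qabot-scope",   # hypothetical secret scope
    key="openai_api_key",  # hypothetical secret key
)
```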
Second, you need to specify your OpenAI API key, in order to leverage, for example, the GPT-3.5 model you're paying for in your OpenAI account. This is the place you paste your API key for OpenAI, because you want to use OpenAI models as the large language model that chats with your own data. You can change the secret name as well, but I just kept it as is. I had mine here and removed it, since it's a credential, but this is where you add yours.

Moving further, here is what's happening next: I am defining a workflow in Databricks. What is a workflow? Consider it like a pipeline with multiple steps. What are those steps? Each step is a Databricks notebook. Which notebooks? Here it is: I'm saying I want to run a workflow where, for example, the first task is a notebook in the current working directory. So this is the first notebook, my first step of the workflow. Where is it? If I open my repo, there you go, the intro notebook is here. So first we run the intro and check the code; then I have another notebook, assemble application; then another one, building the document index; then another notebook further down, evaluation; and lastly, deploying the application. These are exactly the numbered notebooks, one to four, that I'm calling as a workflow to run over this specific cluster. That's it; I'm just saying run this as a workflow using these configurations.

When you run this, it gives you a link, and I realized: awesome, it really is one click. Of course, I ran the RUNME file (not the readme!), read the notebook, and when I executed the last cell, you can see it created the workflow running all the notebooks I showed you, each as a step; then you just click Run Now and you are done. When you go back to your workflows, you'll see under Job Runs that yours is running, and after almost 20 minutes mine said Succeeded. If I click on the job, let's check it out: there you go, all the steps and notebooks I showed you executed successfully, in the sequence defined in the job. Now what I'm going to do is go over all these steps one by one, to see what is happening inside them, what the notebooks are doing, and finally check out this chatbot together.
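Roughly, the workflow the RUNME notebook assembles looks like this multi-task job definition; the task keys and notebook paths here are my illustrative guesses, not the accelerator's exact values:

```python
# Illustrative sketch of the multi-task job the RUNME notebook creates:
# five tasks, one per notebook, chained in sequence on one cluster.
job_spec = {
    "name": "qa-bot-workflow",
    "tasks": [
        {"task_key": "intro",
         "notebook_task": {"notebook_path": "00_Intro"}},
        {"task_key": "build_index",
         "depends_on": [{"task_key": "intro"}],
         "notebook_task": {"notebook_path": "01_Build_Document_Index"}},
        {"task_key": "assemble_app",
         "depends_on": [{"task_key": "build_index"}],
         "notebook_task": {"notebook_path": "02_Assemble_Application"}},
        {"task_key": "evaluate",
         "depends_on": [{"task_key": "assemble_app"}],
         "notebook_task": {"notebook_path": "03_Evaluation"}},
        {"task_key": "deploy",
         "depends_on": [{"task_key": "evaluate"}],
         "notebook_task": {"notebook_path": "04_Deploy_Application"}},
    ],
}
# Submitting a spec like this to the Jobs API (POST /api/2.1/jobs/create)
# yields the job link shown in the video; Run Now then executes the steps in order.
```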
So what is the first step? Let me open it. You will see that this is just an explanatory notebook that tells you what we're going to do. Long story short, this accelerator grabs the Databricks documentation, the Apache Spark documentation, and the Databricks forum, which holds people's technical questions and answers about Databricks. Databricks gathered all of that for this example to create a chatbot, so you can go and ask questions about the Databricks documentation and Spark. Now, assume that data is your data: your company's data, data that is not available on the internet. ChatGPT cannot answer questions about it, because the data simply isn't available to it; but with this technique you can create a chatbot that answers questions based on your data. Here the Databricks documentation is used as an example that can be replaced by your own data. With that data in place, we create embeddings on top of it, and then we can start asking the large language model questions and get the answers back.

That is, logically, what we're going to do. The intro notebook just runs the config notebook to grab my OpenAI key settings that we talked about, and that's it; it isn't doing anything specific, it is just an introduction to what this accelerator does, which is what I just explained.

Then what is next? If I go back to my run, the second step is the document-index notebook. What are we doing here? If I move down, it is installing some packages like openai and langchain; these are needed for tokenization, creating the prompts, connecting to the large language model, creating a vector database with FAISS, doing semantic search, and some other specific libraries needed to execute this accelerator. Then it imports all the packages we just installed.

Coming down, here is the raw data that Databricks uses for this accelerator, which in your case could be your company data or any internal data, something not available on the internet. For this example you can see I have the text about Spark and Databricks, and if I scroll right, you'll see I have not only the text but also its source: you can see it's coming from the Apache Spark documentation, with the relevant text beside it. The accelerator converts this into a Delta table; I now have about 7,000 rows, and each row, as you can see, is a chunk of text.

Now, in order to answer questions about this data, we have to create an index of it, because we cannot hand all of this data to ChatGPT and say "answer this question based on the entire Databricks documentation": no, we have token limits. I explain in detail how you can chat with your data and create embeddings in a separate video; I will add it to the top right of this screen, check that out. But long story short, we need to convert this text to embeddings. What are embeddings? They are just numbers that are understandable by these large language models. For example, I take a text and generate a bunch of numbers from it; when a large language model sees those numbers, it can understand the context. Later, when I ask a question, then based on the embedding of my question and the embedding of this text, it will understand, say, that MG is asking about the Spark overview, and it will retrieve that chunk of text for me as the answer. That's why we need to index our data and create embeddings. And how do we convert these texts to those numbers? Again, we use OpenAI models.

Moving down, here I'm looking at the raw input: we have the text and the source. Sometimes these texts can be pretty long; for example, this code gives me the longest text in the DataFrame, and if I scroll all the way down, look at that: it's just one row of the table, and it's a huge text. Even passing a single one of these texts to the model could be long enough to trigger the error that I exceeded the token limit. That's why I need to split these texts into chunks of fixed size. And you might say: MG, if I split my data into chunks, I might lose the context. That's why we do overlapping: a repeated portion of text shows up in more than one chunk, to make sure we keep the context and don't lose any specific detail of the story. But the challenge is: what is the best chunk size, and what is the best chunk overlap? To be quite honest, there is no direct answer; as Databricks mentions, it is more art than science. For this example, I saw that Databricks used a chunk size of 3,500 and an overlap of 400, but depending on the type of text, how much overlap is actually needed, and the token limit of your OpenAI model, these numbers will change; if you're using GPT-4, that's another scenario, and if more GPT models come out with higher token limits, these numbers change again. I would say start with the defaults as is, but there is obviously an opportunity to play with these numbers to make things more robust.
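Here's a minimal sketch of that fixed-size chunking with overlap, assuming LangChain's character-based splitter and the 3,500/400 values mentioned above; the document text is illustrative:

```python
# Fixed-size chunking with overlap, sketched with LangChain's splitter.
from langchain.text_splitter import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(
    chunk_size=3500,    # max characters per chunk
    chunk_overlap=400,  # characters repeated between neighboring chunks
)

long_doc = "Apache Spark is a multi-language engine for ..."  # one long doc row
chunks = splitter.split_text(long_doc)
# Each chunk shares some text with its neighbors, so context that straddles
# a boundary still appears intact in at least one chunk.
```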
Now that we know the chunk size and overlap, here is a user-defined function that converts the raw input into fixed-size chunks, so no long text will hit the token limit. This is the UDF (user-defined function) that receives the text and splits it based on the chunk size and overlap, giving me the chunked data, which is this: you can see one chunk of text and, beside it, its source. These chunks are now ready to be retrieved for a user question, using ChatGPT as an example. If I scroll up, say the first text is the Spark overview and I want to ask: hey model, give me a Spark overview, what is Apache Spark? The model should go back and retrieve this chunk; based on the embeddings, it should understand that I'm asking about this chunk.

But where do we host these texts and chunks, and how does the retrieval happen? That's why we need to create a vector database: we want to convert these texts to embeddings, and we need a vector store to host those numbers. Your vector database can be Redis, it can be the new vector search that Azure Cognitive Search announced, and I think Azure Data Explorer also supports vector search and storing vectors. But for this example, Databricks is using FAISS, which I believe was created by the AI research team at Facebook. It uses your disk to host the vector database as something located locally that Databricks interacts with; to be honest, it could certainly be replaced with whatever you have in your enterprise.

So it converts all those chunk values into a pandas DataFrame, and then it splits the chunk text from the chunk metadata; I mean, it splits the text and the sources, where the sources are the metadata. Why? Because I want to store them in a vector database. And there you go: it talks about FAISS, a vector database developed, yes, by Facebook research, and you can see here that it uses OpenAI to convert those texts to the numbers we call embeddings; it knows which model and tokenizer to use because we added that in the configuration. Then this is how you set up FAISS: you say, here is a vector database, here are the embeddings, here are the inputs we generated, and here is the source (metadata) of those inputs. Now I can save my vector store to a local path in the Databricks DBFS file system, on disk; again, that storage could live outside of Databricks too. That's it: we created a vector database, we converted the texts to embeddings, and now the embeddings and chunks are indexed in my vector store, which is FAISS.
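A compact sketch of that step, assuming LangChain's FAISS and OpenAI embeddings wrappers; the chunk texts, sources, and DBFS path below are illustrative:

```python
# Building and persisting the FAISS vector store.
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

texts = [
    "Apache Spark is a multi-language engine for large-scale data ...",
    "MLflow is an open source platform for the ML lifecycle ...",
]
metadatas = [
    {"source": "https://spark.apache.org/docs/latest/"},
    {"source": "https://mlflow.org/docs/latest/"},
]

embeddings = OpenAIEmbeddings()  # reads OPENAI_API_KEY from the environment
vector_store = FAISS.from_texts(texts=texts, embedding=embeddings, metadatas=metadatas)

# Persist the index to DBFS so the application can reload it later.
vector_store.save_local("/dbfs/tmp/qabot/vector_store")
```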
Now what is next? Let's go back to my run: we finished the second step, and we are at the assemble application step. Let's see step number three and what happens there. We have a vector store, and we have the texts and the URLs they came from, which we call metadata; now we want to start asking questions, bring the relevant text from the vector store, have the OpenAI model answer the user's question, and return the answer. Okay, let's do that. Again, some packages need to be installed and imported.

Coming down, let's say here is a question I want to test: how to register a model in Databricks. Admittedly this information is available on the internet, and maybe you could use Bing search to answer it, but again, this is an example standing in for your internal data. For example, your company, call it Company X, has a process for opening a support ticket, and the question is "how can I open a support ticket in Company X?": that info is not publicly available. Then I'm saying: for answering this question, the relevant content might live in more than one chunk of data, so bring at most the five most relevant text sources from the vector store to answer it. That can be any number you specify. And there you go: I asked the question (what was it? how to register a model in Databricks), and you can see it's talking about MLflow and how to register a model in Databricks. Right.

Okay, now we need to assemble a function, a wrapper, for that end-to-end process: give me a question, and here's the answer. That's what we're going to do, but of course we need to create a prompt. For example, the prompt should say: hey GPT, you are an assistant for MG; MG is going to ask questions about his company's internal data, and to answer, you go to this FAISS vector database, bring back the relevant sources (for example, the top five), look at them, and answer MG's question based on those five sources. You see, I am defining an identity for my GPT model: what it is supposed to do and how it is supposed to answer. But do I need to write this prompt all by myself? The good news is no, because this is a very common pattern for creating a chatbot that chats with your data: LangChain provides prompt templates, and the Databricks accelerator uses them to generate this template. You can see there are two types of templates: a system message template and a human message template. What is the difference? The system message prompt gives instructions to the model about how we want it to respond, and the human message template provides the details of the user-initiated request.

If I scroll down, with these loaded from LangChain, and of course telling LangChain which model I'm going to use for the answers, I'm just saying: build a large language model chain based on this prompt template. Now the chain knows the prompt and the model it calls to generate the answer. How? Here's what happens: go over all the docs retrieved from the sources to answer MG's question, get the answer from the sources, and if an answer was found, meaning the retrieval worked, print it. How to register a model in Databricks: here's the answer.
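Here's a rough sketch of how that chain can be assembled with LangChain prompt templates; the template wording, the model name, and the reload of the saved FAISS store are my illustrative assumptions:

```python
# Assembling retrieval + prompt + model.
from langchain.chat_models import ChatOpenAI
from langchain.chains import LLMChain
from langchain.embeddings import OpenAIEmbeddings
from langchain.prompts import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    SystemMessagePromptTemplate,
)
from langchain.vectorstores import FAISS

# Reload the vector store persisted in the previous step.
vector_store = FAISS.load_local("/dbfs/tmp/qabot/vector_store", OpenAIEmbeddings())

system_template = (
    "You are an assistant answering questions about Databricks. "
    "Answer only from the context below; if the answer is not there, "
    "say you don't know.\n\nContext:\n{context}"
)
prompt = ChatPromptTemplate.from_messages([
    SystemMessagePromptTemplate.from_template(system_template),
    HumanMessagePromptTemplate.from_template("{question}"),
])

llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0.0)
qa_chain = LLMChain(llm=llm, prompt=prompt)

# Retrieve the five most relevant chunks and answer from them.
question = "How to register a model in Databricks?"
docs = vector_store.similarity_search(question, k=5)
context = "\n\n".join(d.page_content for d in docs)
print(qa_chain.run(context=context, question=question))
```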
So now we want to deploy this, not a particular model, but this application, and create an endpoint. For example, I give you a URL and say: if you have any questions about MG's data, use this URL, send your request over HTTP from wherever you are, and get your answer back. That should remind you of MLflow, right? This is exactly how I deploy machine learning models with MLflow: you train your model, test it, validate it, log it in MLflow, and deploy it; then the integration of MLflow and Databricks gives you an endpoint you can call to get predictions back. That is exactly what we're going to do. But don't forget: we are not deploying an OpenAI model here. No, it is not a model I have trained, so I'm not deploying the model itself; I couldn't deploy it in my Databricks anyway. I'm deploying the code, the application I have, which interacts with OpenAI and answers questions about my internal data.

Then how can I deploy it? You define your application as a class: your chatbot. It simply takes the prompt, the retrieval process (the FAISS vector store), and the large language model it needs to talk to. There is also some handling of abbreviations: if the model sees abbreviations in the text, like DBR, it might not know what DBR means, so I give it a heads-up on what these abbreviations stand for. And I want to make sure the answer is valid or not: if you see "I don't know", "not clear", "sorry", or no answer, that means ChatGPT was not able to answer; that's a bad answer. Otherwise, if it's a good answer, get the answer, make sure we are under the timeout limit, and get the results. If we have a good answer, provide the answer with the highest-ranked source, I mean based on the most relevant chunk of text, the number one of the five we retrieved, and return the result. That is exactly what this wrapper code does. I'm explaining it quickly, but check out the details: you can modify and change it; some parts are very specific to this accelerator, I think, so you can remove or change them, and this part could be further simplified, or made more complex to handle more scenarios.

Okay, now that I have this class, I can simply call it and get my answer. The question was how to register a model in Databricks, and the answer comes back just by calling this class that I defined at the top. Now I feel ready to persist this solution to MLflow, and the rest is simple: we can use a Python function (pyfunc) wrapper to register this application in MLflow, so that later we can deploy it as an endpoint. I'm saying: this is the class I want to persist to MLflow, and this is the wrapper class created for registering it under MLflow. With that wrapper in place, I can start an MLflow run and log what I created above. Is it a model? Not an OpenAI model: it's my chatbot that I am logging; I'm persisting it to MLflow here, along with some package requirements and configuration, and I'm done.

If I go to, let me actually check, if I go to the registered models: there you go, version 1 created. This is my chatbot wrapper that uses OpenAI, with all the logic I specified, to interact with and ask questions about my own data. If I click on the source run, it opens the notebook I just showed you. If I go back, this is Models; if I go to Experiments, this is the experiment created for this accelerator. I click on it, this is the latest and only one I created, and if I click through, there you go: you can see the pickle file, which is the class we talked about, the one calling the OpenAI large language model with all the logic we specified to retrieve the most relevant sources of data from the vector database and then generate the answer. You can also see the requirements we specified to be installed: openai, langchain, and so forth.
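A heavily simplified sketch of that persistence step, reusing qa_chain and vector_store from the previous sketch; the accelerator's real wrapper also handles abbreviations, bad-answer detection, and timeouts, and the registry name here is hypothetical:

```python
# Persisting the chatbot as an MLflow pyfunc model.
import mlflow
import mlflow.pyfunc

class QABotWrapper(mlflow.pyfunc.PythonModel):
    def __init__(self, qa_chain, retriever):
        self.qa_chain = qa_chain
        self.retriever = retriever

    def predict(self, context, model_input):
        # model_input arrives as a pandas DataFrame with a "question" column.
        question = model_input["question"][0]
        docs = self.retriever.get_relevant_documents(question)
        ctx = "\n\n".join(d.page_content for d in docs)
        return self.qa_chain.run(context=ctx, question=question)

with mlflow.start_run():
    mlflow.pyfunc.log_model(
        artifact_path="model",
        python_model=QABotWrapper(
            qa_chain, vector_store.as_retriever(search_kwargs={"k": 5})
        ),
        pip_requirements=["openai", "langchain", "faiss-cpu"],
        registered_model_name="databricks_qa_bot",  # hypothetical registry name
    )
```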
Getting back to my workflow and the job we created: we finished the assemble step, and now we want to evaluate this MLflow wrapper we configured, so click on the next step. For the evaluation, here's what I'm going to do: I want to see how accurate this solution is at answering questions about the documentation we indexed. For example, if someone asks "what is Databricks?", we want to make sure it doesn't answer "Apache Spark is..."; no, it should answer what Databricks is. How can we do that? Well, you need a subset of labeled data. For example, here is a question, "what is Databricks?", and here is a sample answer: "Databricks is a unified data platform that supports this and that." You need that labeled data; then you give the question to the solution, the solution answers it, and we compare how close the solution's answer is to the labeled answer.

But then how do we evaluate? We're going to use LangChain, so let's check that out. Here I am importing the MLflow wrapper I logged in the previous step, and then the Databricks accelerator uses labeled data; for example: what's the difference between Spark cache and disk cache in Databricks, and here is the answer inside the labeled data. Now we give that question to the solution we implemented and see whether its answer is close to the labeled one or not. How do we assess that? If you scroll down, it uses LangChain's evaluation chain (QAEvalChain), which itself uses an OpenAI model, a large language model, to check how close your solution's answer is to the labeled answer. If it's close enough, it says it is correct; if not, it says it is not correct. So it implemented this logic; check it out: here's a question, here's what our solution answered, here is the answer from the labeled data, and here is how similar the two are. The evaluation methodology from LangChain says these two are pretty close: true. As I can see, most of them are true. We have a false one, actually; let me see why it's false. What is the question? Okay, because we had no prediction; interesting. I want to see if we have more false examples. Okay, they seem mostly true... okay, false, false. What about this one: what's the difference between DBFS and UC? Let's first see the real, correct answer. Hopefully I didn't lose the row; I think I did, let me scroll up. There you go, this one. Here's the answer: DBFS is the Databricks File System, of course we know that, and UC is Unity Catalog, which is a whole different thing, an approach for data governance; so these are totally different. But let's see our solution's answer, which was judged false. If I scroll right, there you go: "I don't see any information in the context of..." Okay, so it was not able to fetch the relevant data, so the answer is not even close to the actual answer; that's why the evaluation is false. So you see the approach and what's happening.
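Here's a small sketch of that grading step, assuming LangChain's QAEvalChain; the labeled Q/A pair and the prediction are illustrative:

```python
# LLM-graded evaluation: a second model judges each prediction
# against the labeled answer.
from langchain.chat_models import ChatOpenAI
from langchain.evaluation.qa import QAEvalChain

examples = [{
    "query": "What's the difference between Spark cache and disk cache?",
    "answer": "Spark cache keeps DataFrames in executor memory; disk cache "
              "stores copies of remote Parquet data on local disk.",
}]
predictions = [{
    "result": "Spark caching holds data in memory on the executors, while "
              "the disk cache accelerates reads by caching files locally.",
}]

grader = QAEvalChain.from_llm(
    ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0.0)
)
graded = grader.evaluate(
    examples, predictions,
    question_key="query", answer_key="answer", prediction_key="result",
)
print(graded)  # e.g. a CORRECT / INCORRECT verdict per example
```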
Now we can simply compute, in general, the percentage of true scenarios: 91% of the time. Not bad, right? That's it. Now I'm going to deploy the solution through MLflow and create an endpoint out of it. Going back to my workflow, we are at the last step right here: deployment. This is exactly like deploying an endpoint for a machine learning model you have in MLflow: you just grab the latest version of your registered chatbot and push it to production. You see that I am getting into the area of LLMOps, or MLOps for large language model applications: now you have MLflow, you have stages, evaluation, dev, staging, prod. That's a whole different topic by itself, and if you're interested, let me know; we can have a dedicated video just about LLMOps, which is the new trend of how to apply and customize MLOps for large language models. But I'm giving you a heads-up: this is what we're going to deal with.

Now, to get the model deployed, you need to specify some configuration: of course you need the name of the model and the version you want to deploy, and you know this wrapper needs my OpenAI key and secrets, which it can get from the config notebook. And remember, when we create an endpoint we can route the traffic, because we can have multiple served versions running behind one endpoint with dynamically allocated traffic; here we deploy just one, so 100% of the traffic goes to it. Then there are just helper functions for creating an endpoint: if the endpoint exists, update it; if it doesn't, create it. We can use them as a reference. And that's it: it starts the endpoint for me, waits until it is ready, which took about five minutes, and then gives me the URL of my endpoint. If I click on it (it will give one to you as well), there you go: now you have a real-time endpoint listening 24/7 for any question to answer.

If I click on "use model for inference", there you go; oh sorry, if I go back to the registered model, again this is the model I have, and this is the endpoint. For this endpoint, you can see that you can test it right here, and it shows how to call it with curl, or how Python code can submit your question about your internal data and get the answer back. Here is an example of how we did it in this notebook: if you go all the way down to the last step of the workflow, you'll see it uses the same helper function: here is the URL of my large language model application, give it your question, and get the result back. There you go: here's my sample question and the request, and the answer comes back. I actually have a notebook where I can test this even further: if I go to the deploy application notebook, the same one I just showed you, check it out, I can even change my question, for example "How can I create a compute cluster in Databricks?" It calls the model endpoint I created, in real time; let's see the result. It is calling, of course, the endpoint we just created... still running... there you go: "to create a compute cluster in Databricks, follow these steps."
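As a minimal sketch, calling the finished endpoint from Python looks something like this; the workspace URL, endpoint name, and token are hypothetical, and the exact payload schema depends on the signature the model was logged with:

```python
# Querying the real-time serving endpoint from anywhere.
import requests

workspace_url = "https://<your-workspace>.cloud.databricks.com"
endpoint_name = "databricks_qa_bot"           # hypothetical endpoint name
token = "<databricks-personal-access-token>"  # never hard-code this in real apps

response = requests.post(
    f"{workspace_url}/serving-endpoints/{endpoint_name}/invocations",
    headers={"Authorization": f"Bearer {token}"},
    json={"dataframe_records": [
        {"question": "How can I create a compute cluster in Databricks?"}
    ]},
)
print(response.json())
```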
So, to wrap up what we did end to end through this workflow: we deployed this accelerator provided by Databricks and went through all the steps. We had some notebook configuration to add our OpenAI key and related settings. Then we started indexing our data, here the Databricks and Apache Spark documentation; we used a vector database to host our embeddings, FAISS for hosting them, and of course OpenAI to create the embeddings. Then the user's question gets submitted, we check which sources of data are relevant to it, and we bring them into the prompt; we did that using LangChain's prompt templates and using FAISS to calculate the similarity and bring the relevant data back. With that relevant data submitted to the OpenAI model, we get the answer back. We also evaluated how good our approach is, using LangChain again for the evaluation. And lastly, we deployed it as a wrapper through MLflow, to get a real-time endpoint we can call and get answers back from. That means if I give you my endpoint URL and show you how to call it, and you have permission to authenticate, you can call my endpoint: if you have any Databricks question, push it to that endpoint and get the result back in real time. Now you can have this endpoint integrated into your Teams bots, your Slack channel, your website, anywhere; you can have that chatbot embedded as long as you have the endpoint URL and add it to your application code.

So think about how you can use this accelerator to start developing your own customized solution on your own internal data, using Databricks and MLflow end to end, and of course your own choice of vector database if needed. I hope you enjoyed this video. Make sure you write your comments in the section below; I read them all and respect what you'd like to hear the most, so I will get the next content prepared for you. We'll see you in the next video.

Revenge? Strong people forgive, intelligent people ignore. Dream big, my friends, believe in yourself, and take action. Next video, take care.
Info
Channel: MG
Views: 2,653
Keywords: open ai, artificial intelligence, machine learning, openai chatbot, open ai chat, openai chatbot gpt writing code, openai chatbot demo, chat gpt explained technical, open ai chat gtp, Azure Open AI, Open AI in Azure, Azure Open AI demo word embeddings, chat gpt, advanced tutorial chatgpt, chatgpt advanced guide, Databricks with Chat GPT, Databricks with Open AI, Deploy GPT Chatbot in Databricks Mlflow, Open AI and ChatGPT with MLflow, LLM Mlflow, LLMOPs, MLOps for LLM
Id: KhpjBa8djMk
Length: 35min 17sec (2117 seconds)
Published: Tue Jun 13 2023