Data Engineering with large language model || Streamlit llm chatbot || Chat with private Data

Video Statistics and Information

Captions
Hello everyone, welcome back to my channel. Today I'm going to show you how we can do data engineering tasks using LLMs. As always, I will be using an open-source LLM; this time I'm using Mistral 7B. The whole program will be executed on my local machine, which has no GPU and only 16 GB of RAM. Let's start with the demo, but before that, if you like the content of my channel, please like the video and subscribe to the channel.

Let's quickly dive into the demo. First of all, we have to upload a data file. I'm picking this employee file. Let me show you the employee file first: we have employee ID, first name, last name, phone number, and a few other fields, along with SSN, nationality, and so on. We'll upload this file and then try to do data engineering tasks on it.

Let's go back to the user interface. Here I'm asking it to split the salary column as 10% H, 70% basic, and 20% allowance. Here is the output: it has returned the data frame from our uploaded file, and if we go to the end we see that three new columns have been added which were not part of the file. We have H, which is 10% of the salary, basic, which is 70% of the salary, and allowance, which is 20% of the salary.

Let's put another question. Here we will ask it to mask the SSN that is in the data. The SSN is private information, so we want the tool to mask it, leaving the last four digits. We have put it as stars followed by 1 2 3 4. Let's see what it does. Here is the output: as we requested, it has left the last four digits of the SSN and masked the rest. So this is a very useful tool, and we can call it a data engineering chatbot.

That's it for the demo. Let's start understanding the code behind it. We have basically two files; one is for the user interface. Before we go into the details of the code: the code is available on GitHub, and the link is in the description of the video. You can check it out, try it on your own, do more things, and comment in the comment box what you have done. Share it with others.

This is the standard Streamlit code for creating the chat interface. This section is to upload the file, and the uploaded file is saved in the input folder. After that, I have imported the main file, which I will come to later. In it I have a function, or rather a class, data chat, within which I have another function that I'm calling over here with the path of the file that the user is uploading.

Now coming to the main file. Here we have the class data chat, within which we have the init function, which just defines the model name and the model parameters such as the temperature, the context length, and so on. Then we have the prompt, where the instructions are defined: what the large language model is supposed to do. After that, we put the user question to the LLM. The response that we get is not exactly in a format that can be used directly. Let me show you: this is what we get for the queries we put earlier. We have quite a few things mentioned apart from the code, so we need to extract the code and execute exactly that. For that we have this function where, based on this pattern, we extract the code. After that we fix the format of the code using autopep8, which is a code formatter for Python, and then we execute the code. This way we operate on the data frame that is loaded from the input data file. It is quite straightforward and easy to implement.

So let us know what you try, what you do with this approach, or what else you can find out. That's all for today. See you in the next video. Thank you, bye-bye.
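The two demo requests (splitting the salary and masking the SSN) could be served by pandas code along the following lines. This is a minimal sketch, not the code the model produced in the video: the column names `Salary` and `SSN`, the output column names, and both function names are assumptions made for illustration.

```python
import pandas as pd


def split_salary(df: pd.DataFrame) -> pd.DataFrame:
    """Add three columns derived from an assumed 'Salary' column."""
    out = df.copy()
    # "H" is the component name as it appears in the captions (assumed label).
    out["H"] = out["Salary"] * 0.10
    out["Basic"] = out["Salary"] * 0.70
    out["Allowance"] = out["Salary"] * 0.20
    return out


def mask_ssn(df: pd.DataFrame) -> pd.DataFrame:
    """Replace every SSN digit except the last four with '*'."""
    out = df.copy()
    # A digit is masked when at least four more digits follow it;
    # separators such as '-' are left untouched.
    pattern = r"\d(?=(?:\D*\d){4})"
    out["SSN"] = out["SSN"].astype(str).str.replace(pattern, "*", regex=True)
    return out


if __name__ == "__main__":
    employees = pd.DataFrame(
        {"Salary": [1000.0, 2500.0], "SSN": ["123-45-6789", "987-65-4321"]}
    )
    print(mask_ssn(split_salary(employees)))
```

Both helpers return a copy, so the data frame loaded from the uploaded file stays unchanged until the result is explicitly kept.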
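The upload step described for the user interface (the uploaded file is saved in the input folder and its path is passed on) might look roughly like this. It is a sketch under assumptions: the folder name `input` comes from the captions, while `save_upload`, `render_upload_widget`, and the `csv` file type are hypothetical and are not taken from the author's repository.

```python
from pathlib import Path

INPUT_DIR = Path("input")  # folder name as mentioned in the captions


def save_upload(filename: str, data: bytes, folder: Path = INPUT_DIR) -> Path:
    """Write uploaded bytes into the input folder and return the saved path."""
    folder.mkdir(parents=True, exist_ok=True)
    # Keep only the file name so a crafted name cannot escape the folder.
    target = folder / Path(filename).name
    target.write_bytes(data)
    return target


def render_upload_widget():
    """Show a Streamlit uploader and save the file once one is provided."""
    import streamlit as st  # imported lazily; only needed inside the app

    uploaded = st.file_uploader("Upload a data file", type=["csv"])
    if uploaded is not None:
        return save_upload(uploaded.name, uploaded.getvalue())
    return None
```

The path returned by `render_upload_widget` is what would then be handed to the chat class, as the captions describe.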
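The extract, format, and execute steps described for the main file can be sketched as follows. The regular expression, the function names, and the convention that the generated code reads and reassigns a variable called `df` are all assumptions; the actual pattern and prompt live in the author's repository. `autopep8.fix_code` is the formatter's real entry point, and the sketch skips formatting if the package is not installed.

```python
import re

FENCE = "`" * 3  # three backticks, built here to keep this example readable
# Assumed pattern: a fenced block, optionally tagged as python.
CODE_BLOCK = re.compile(FENCE + r"(?:python)?\s*\n(.*?)" + FENCE, re.DOTALL)


def extract_code(response: str) -> str:
    """Return the first fenced code block found in a model response."""
    match = CODE_BLOCK.search(response)
    if match is None:
        raise ValueError("no fenced code block found in the model response")
    return match.group(1).strip()


def tidy(code: str) -> str:
    """Format the code with autopep8 when it is available."""
    try:
        import autopep8
    except ImportError:
        return code
    return autopep8.fix_code(code)


def run_on_data(code: str, df):
    """Execute generated code with `df` in scope and return the new `df`.

    exec runs arbitrary code, so this is only reasonable for trusted,
    local experiments like the one in the video.
    """
    namespace = {"df": df}
    exec(tidy(code), namespace)
    return namespace["df"]


if __name__ == "__main__":
    reply = (
        "Sure, here is the code:\n"
        + FENCE + "python\ndf = [x * 2 for x in df]\n" + FENCE
        + "\nHope this helps."
    )
    print(run_on_data(extract_code(reply), [1, 2, 3]))
```

In the application, `df` would be the pandas data frame read from the uploaded file; a plain list is used in the demo only to keep the sketch free of extra dependencies.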
Info
Channel: Joy Maitra
Views: 192
Keywords: llm code generation, run llm locally, chatbot using open source llm, chatbot using python, llm tutorial for beginners, local gpt, local llm, mistral ai, run open source llm locally, open source llm, code generation with llm, genai
Id: RmAhlQDwVmo
Length: 6min 39sec (399 seconds)
Published: Mon Mar 11 2024