Scikit-learn + ChatGPT = Scikit LLM

Video Statistics and Information

Captions
These are my prompts for NLP tasks. For example, this is the prompt I use for sentiment analysis, this is the one I use for multi-class classification, this is for topic modeling, this is for text summarization, and sometimes one for generating word embeddings. On the downside, to use ChatGPT you need a prompt, and then you need to parse the output of that prompt. Say goodbye to those good old scikit-learn days when we could use model.fit, model.predict, and boom, the predictions were there. Now you need to prompt the model and parse the output — that's the whole story, my friend. Well, as of now, scikit-learn-style syntax is integrated with LLMs. That means you can still use model.fit or model.predict, but it uses ChatGPT on the backend, without you writing prompts or parsing the prompt output. What was the package name? Scikit-LLM. So I don't need to write the prompts anymore? No, we don't need to write these prompts; on the backend it builds the prompts for you. You just type, let's say, model.fit, here is my sentiment analysis scenario and my input, and boom, the output is there — very similar to... I think our neighbors are burning their food. We do have a fire alarm system here. Oh, I should get to the balcony. [Music] Oh. [Music] I hate your video ideas. Don't try this one. Let's go.

Hello my friends, welcome to this video, in which we're going to talk about a nice open source package called Scikit-LLM. As you can guess from the name, we're going to combine the familiar syntax we use to work with scikit-learn, this time for interacting with large language models, namely OpenAI models and ChatGPT. So now, instead of you writing a prompt for sentiment analysis or any NLP task on your dataset, we can use this new package, which handles all the work on the backend for us — the prompts, the parsing — and we just need to say model.fit, model.predict. Very similar to what we used to have with scikit-learn: the same syntax and similar coding, but on the backend we are leveraging a new technology, large language models and OpenAI models. So let's check out how we can enable it in our workflow and give it a try. Before we start, make sure you subscribe and hit the bell icon so you get notified about the next video. Thank you.

All right, welcome to Scikit-LLM, my friends. That seems to be a very interesting combination — scikit-learn on one side and large language model capabilities on the other. So let's see what the proposed value of this new library is and what we can achieve with it. Although I didn't see it mentioned in the official scikit-learn documentation, the functionality and syntax are pretty similar. Remember that in scikit-learn we had lots of easy-to-follow syntax like model.fit, model.train, model.predict. We're going to use the same syntax, but this time the backend model is not a model we train from scratch — the backend model is an OpenAI model, for example GPT-3.5 or ChatGPT. Think about some of the well-known text-based use cases: sentiment analysis; categorization, say you have multiple labels and want to do topic modeling — what's the topic or title of this given comment, survey, or text; text summarization; text translation; or generating embeddings from text.

For these sorts of well-known, structured NLP use cases, you might think, "I can still use OpenAI models like ChatGPT directly." Yes, you can. But as we all know, in order to use ChatGPT for any NLP task, you have to specify in your prompt what you want it to do. For example, if you have received survey results and you want to do sentiment analysis to check whether they're negative or positive, you have to write that down in a prompt to the model: hey, I'm giving you some survey results, some reviews, some comments — do sentiment analysis for me and tell me if each is positive, negative, or neutral. Defining that prompt clearly is a challenge by itself, and when you get the output of that prompt — the model says this is negative, this is positive — you then need to parse the output. Even if you use functions, you have to make sure you define the structure and the prompt properly, so you don't end up in a loop of pushing your data into the prompt one by one, getting the sentiment back from the OpenAI model, and parsing it. So you'll see there are certainly workarounds and potential challenges, as we have all faced if we've built this kind of thing for our large language model applications.

But today, with Scikit-LLM, you don't need to write a single prompt for any of the NLP tasks I mentioned. It feels like you're coding in scikit-learn, but on the backend it's ChatGPT, without any prompting on your side. Let me show you. This is my Python notebook — you can use any notebook or IDE you want, just make sure you have a Python environment — and to start, just run pip install scikit-llm. That's it. Then, scrolling down, I specify my OpenAI key (I'll definitely revoke it after recording this video, so make sure you keep yours safe, since they are confidential). The second thing is your organization ID, which seems to be optional. How do you get that? Go to openai.com, log in, go to your profile, and under Settings you will see the organization ID of your account. And of course you have your Azure OpenAI key or OpenAI key here.
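For reference, here is a minimal setup sketch based on the scikit-llm README from around the time of this video; the key and organization ID below are placeholders, and the exact configuration helpers may differ in newer releases:

```python
# Shell: pip install scikit-llm

from skllm.config import SKLLMConfig

# Placeholders -- substitute your own OpenAI API key and (optional) organization ID.
SKLLMConfig.set_openai_key("<YOUR_OPENAI_API_KEY>")
SKLLMConfig.set_openai_org("<YOUR_ORGANIZATION_ID>")
```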
On top of that, what I'm going to do is start with the ZeroShotGPTClassifier. Here I'm not providing any few-shot examples in the prompt, not writing out "hey, I want to do some sentiment analysis" — nothing. I just have a dataset, and that dataset comes from the scikit-llm library itself. Let's see what this data is. I printed X and y, and you'll see these are some comments — I didn't really read them, but they seem to be surveys or reviews of people about something; maybe movie reviews. And in the y label I have a label for each comment: for example, the first one is positive, then positive, negative, neutral — so I have three labels. I grab this data (it can be your own data, for sure; this is just an example), and I say I want to use ZeroShotGPTClassifier, backed by ChatGPT or GPT-3.5 from OpenAI, as my classifier, and I fit it to my example data. See, it is exactly like what you're used to in scikit-learn, with fit, predict, and so on. Now that I have this clf fitted on my data, I can use it to predict on new data.
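A rough sketch of that zero-shot flow, following the same README-era API (the class name ZeroShotGPTClassifier, the demo loader get_classification_dataset, and the openai_model argument are taken from that documentation and may have changed since):

```python
from skllm import ZeroShotGPTClassifier
from skllm.datasets import get_classification_dataset

# Demo dataset shipped with the package: short reviews (X) and
# sentiment labels (y) in {"positive", "negative", "neutral"}.
X, y = get_classification_dataset()

clf = ZeroShotGPTClassifier(openai_model="gpt-3.5-turbo")
clf.fit(X, y)            # no training happens; the candidate labels are only collected
labels = clf.predict(X)  # each text is classified via a prompt behind the scenes

# It also works on texts you type yourself:
print(clf.predict(["I loved every minute of it.", "What a waste of time."]))
```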
In this example I use the same X again, and when I print the predicted labels, there you go — I have them all predicted. Not only that: I typed a good review and a bad review myself, used the same model, called .predict on those inputs, and it told me the first one is positive and the second one is negative. So it feels like a model I trained myself with scikit-learn, but it's not — on the backend I'm using an OpenAI model, and you don't see any of the prompt machinery: parsing the output of the prompt, pushing it back into a data frame. On the backend this has all been configured for me by the package. I actually went through their source code, and in the prompt folder of the GitHub repo — by the way, I'll add this GitHub repo and all the notebooks and code to the Discord channel; the link to the Discord channel is in the video description down below, and when you go to the Discord channel and click on the reference section you will see the link there — I clicked on templates and saw the prompts they're using, for zero-shot and for some few-shot examples we'll go through as well. So this is what is abstracted away for you: you don't see it, it's on the backend, and you just go ahead with something very similar to scikit-learn.

Okay, moving further, another example. This time, let's say you have no labeled data. I could never have imagined that one day I could use scikit-learn syntax to get a model — not train a model, just have a model — without any labels, and still do classification. Seriously, this is what is happening. Let's say you have no labels, you just have those movie reviews. You'll see I grab the same data, but this time I only take X; I don't take the labels, because I have none. And then I can still fit it to the model. Again, we are not training any model here, and we are not fine-tuning OpenAI models (although you can do fine-tuning with the scikit-llm package for OpenAI models; here I'm not doing any fine-tuning). I'm just saying: this is my data, I have no labels for it, but I know this data should be positive, negative, or neutral — so classify it for me. That's it. Then I call .predict again, and look at that: the same labels got predicted for me, without providing any labeled data. This is pretty cool. I know that when you think about the backend it all makes sense, because it's using GPT, which doesn't necessarily need labeled data — it understands what you mean by negative and positive. But this is really powerful. If we could go back to the time before we had such a thing and imagine that one day we'd have what I'm describing right now, we would be amazed. But we have it right now, so use it.

How about this: what if I have multi-label classification? For example, I have a review, and you have to tell me which of several labels apply — multiple labels for the same single review. This is how we do it: we provide the candidate labels — for example, the review could be about quality, price, or delivery; one of them, two of them, or all three — with at most three labels to associate with the given input (which is just one review, by the way). Then, same as before, I call .predict, and look at that: the first review is just about quality, the second is about delivery. Maybe these aren't movie reviews after all, since "product variety" doesn't make sense for a movie. But look at that: for some of the reviews I get two labels, and I can get at most three. Again, without any labeled data. That's it.
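Both the no-label and the multi-label variants could look roughly like this under the same assumptions — the fit(None, candidate_labels) pattern, MultiLabelZeroShotGPTClassifier, and its max_labels parameter come from the README of that period, and the candidate label names are only illustrative:

```python
from skllm import ZeroShotGPTClassifier, MultiLabelZeroShotGPTClassifier
from skllm.datasets import (
    get_classification_dataset,
    get_multilabel_classification_dataset,
)

# 1) Zero-shot classification with no labelled data at all:
#    pass None for X and only the candidate label names.
X, _ = get_classification_dataset()
clf = ZeroShotGPTClassifier(openai_model="gpt-3.5-turbo")
clf.fit(None, ["positive", "negative", "neutral"])
print(clf.predict(X))

# 2) Multi-label zero-shot: up to `max_labels` labels per review.
X_ml, _ = get_multilabel_classification_dataset()
candidate_labels = ["Quality", "Price", "Delivery", "Service", "Product Variety"]
ml_clf = MultiLabelZeroShotGPTClassifier(max_labels=3)
ml_clf.fit(None, [candidate_labels])  # candidate labels are wrapped in a list
print(ml_clf.predict(X_ml))
```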
Moving on. Those were examples of sentiment analysis and topic-style labeling, but beyond that you can do vectorization, i.e. create embeddings from your text. Just import GPTVectorizer, give it your input texts X, and that's it — the embeddings are there. You can also do summarization: GPTSummarizer, imported from scikit-llm; you choose your model and say "summarize each input in my X column in at most 15 words", and each row of the output corresponds to a summarized version of one of those reviews. Not only that, you can do translation as well. For example, I imported a translation dataset from this package — this is the input in its original language — and I told GPTTranslator I want the output in English; when I print the result, it is indeed in English. So I translated text into another language without typing a prompt, I created embeddings without writing a prompt, and I labeled or categorized my data and did sentiment analysis without any prompting or output parsing — just using an easy-to-follow syntax I was already familiar with from scikit-learn.

Beyond that, going through the documentation, there are a couple more nice examples. You can do fine-tuning of the GPT models that support it, using the same scikit-learn-like syntax we discussed. The other thing I saw is that you can even do few-shot classification. I showed you zero-shot; with few-shot the difference is that it takes some samples of your training data and adds them to the prompt, to tell the model: hey, here are some examples. Remember few-shot prompting, where you give example inputs and outputs in the prompt so the model better understands what you're trying to achieve? If you do the same thing here, it will sample — perhaps randomly — some of your training data and add it to the prompt, which can make the performance slightly better.
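Sketched under the same assumptions, the remaining pieces look roughly like this; the module path skllm.preprocessing and parameters such as max_words and output_language are taken from the README of that period and should be checked against whichever version you install:

```python
from skllm import FewShotGPTClassifier
from skllm.preprocessing import GPTVectorizer, GPTSummarizer, GPTTranslator
from skllm.datasets import get_classification_dataset, get_translation_dataset

X, y = get_classification_dataset()

# Embeddings: one vector per input text.
vectorizer = GPTVectorizer()
embeddings = vectorizer.fit_transform(X)

# Summaries of at most ~15 words per review.
summarizer = GPTSummarizer(openai_model="gpt-3.5-turbo", max_words=15)
summaries = summarizer.fit_transform(X)

# Translation of a small multilingual demo dataset into English.
X_translate = get_translation_dataset()
translator = GPTTranslator(openai_model="gpt-3.5-turbo", output_language="English")
translated = translator.fit_transform(X_translate)

# Few-shot classification: the labelled examples passed to fit() are
# injected into the prompt rather than used to train anything.
few_shot = FewShotGPTClassifier(openai_model="gpt-3.5-turbo")
few_shot.fit(X, y)
print(few_shot.predict(X))
```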
So that was a quick review of this package. I saw that it's trending; I think it was released almost three months ago, and they keep releasing new versions — the last one was three weeks before the date I'm recording, which is today, September 10, 2023 — and it has about 2.5k stars on GitHub. I thought it was certainly worth going through it, showcasing it to you, and giving you a sense of the possibilities. Again, if you're working on these kinds of NLP tasks and you're planning to use OpenAI models, a nice and easy way to start, instead of writing all the prompts yourself, is to come here and check out the prompts they're using to get ideas, or just install this package and move on — and see what results you get, saving yourself time and not really needing to develop your own custom classes for parsing or prompt generation and so forth. And of course, kudos to all the contributors of this open source package. I hope you enjoyed this video. That's all.

There's nothing more practical than having a vision in life. Without a vision you are hopeless, you are disappointed; anyone can stop you by putting a single obstacle in front of you. You're like a leaf blowing in the wind, and that's not what you're called to be. You are called to be a constructor, a visionary of paradise — that's really who you are. You're going to lose whatever you have to lose, and you're all together in this game. So dream big, my friends, believe in yourself, and take action. See you in the next video. Take care. [Music] Thank you.
Info
Channel: MG
Views: 1,490
Keywords: artificial intelligence, machine learning, openai chatbot, openai gpt 3, openai chatbot demo, open ai chat gtp, Azure, Azure Open AI, ChatGPT in Azure, Open AI in Azure Demo, Azure Open AI demo word embeddings, chatgpt advanced guide, Azure Cognitive Search with Open AI, SKlearn-LLM, Scikit-Learn, SKLearn, Scikit-learn with ChatGPT or LLM, Scikit-LLM: Sklearn Meets Large Language Models, GPT4, Scikit-learn + ChatGPT = Scikit LLM
Id: A-j5S3hRZxs
Length: 16min 29sec (989 seconds)
Published: Tue Sep 12 2023