Ask AI to analyse Pandas DF - PandasAI powered by Open AI

Video Statistics and Information

Captions
It would be great if you could just ask questions of your tabular data. If you are using Python, pandas is a library you will be using a lot, so you can just chat with pandas using PandasAI. Yes, you can use artificial intelligence capabilities powered by OpenAI, or by open-source models like Open Assistant, to ask questions of a pandas DataFrame and get answers back, and that is what we are going to see. We're not just going to talk about the library itself; we are going to do a hands-on with a Google Colab notebook. I'll share the Colab notebook with you. This is the notebook they have provided as part of the repository, so it's not completely my own creation.

First we're going to learn what this library is. The library is PandasAI, and you can do pip install pandasai. When you install this library you also need to install the openai library, and after that you can simply ask questions of your pandas DataFrame. For example, if you have a dataset like this, you can ask "Which are the five happiest countries?" and the code above will give you an answer like this. Then you can start using it: ask more questions and it will keep giving you answers. One of the most important things is that you need an OpenAI API key for this. Alternatively you can use Open Assistant, another, open-source large language model, in which case you need your Hugging Face API key. It is released under the MIT license, which means it's a permissive license and you can use it for a lot of things.

Now, that's enough talking; I'm going to get started with the Google Colab notebook. I'm demonstrating this on Google Colab, and the notebook will be linked in the YouTube description, but you can use this on your local machine as well; you don't necessarily have to do it on Google Colab. The first thing that you need to do is you need
to install all the dependencies. In this particular case we are going to install pandasai and openai. PandasAI is the main library, and we are installing openai so that we can authenticate the current PandasAI session with our OpenAI API token. Once you have successfully installed those two libraries, you need to do four imports: the first is pandas, the second is PandasAI, the third is OpenAI, and the fourth is Open Assistant. Just a disclaimer: at the time of recording, May 1st, Open Assistant was showing an error. There is a separate GitHub issue tracking that, and since it's a very small issue I hope the developers will fix it. So as part of this demo I'll show you the code for Open Assistant, but we are not going to run it because, like I said, as of May 1st there is an error; most likely in a couple of days this error will have been fixed.

Now we have imported all the required libraries. The next thing we need is a DataFrame. We can import or download any DataFrame, but in this case I've decided to use the same one they've given as part of the example. This DataFrame has three columns: country, GDP, and happiness index. We can see it with df.head(), which displays the DataFrame for us so we can see what is in there: country, GDP, and happiness index. I'm going to use this DataFrame to ask questions using artificial intelligence powered by OpenAI.

For that to happen we first need to get the OpenAI API key. I've got a lot of tutorials where I have shown how to get an OpenAI API key. Get your key and put it in the variable OPENAI_API_KEY; then you can pass it as the API token to the OpenAI wrapper and assign the result to an object called llm. Now we can use this llm inside our PandasAI wrapper. We are going to instantiate a new object called pandas_ai, and
that is going to take the llm and use PandasAI to create the object, which is going to help us question the dataset, or the DataFrame in this case. So we say pandas_ai = PandasAI(llm); the llm that you created here is what we are going to use to ask questions. After we have instantiated pandas_ai, we call pandas_ai.run, which means we are expecting it to run this particular query. What is the query? The first argument we send to this method is df, the DataFrame we created. The next one is the prompt in English, which is the actual question: the natural-language question that we want answered from this DataFrame. In this case we are going to ask, "Which are the five happiest countries?"

Before we even see the answer, let's look at the data. Going by the happiness index, the five happiest countries are the United States, Canada, Australia, the United Kingdom, and Germany, in that order; as you can see, the values are 7.3, 7.3, 7.3, 7.2, and 7.0. These are the top five countries in terms of happiness index. We didn't specifically mention the happiness index in our question; we just asked which are the five happiest countries. And as you can see, it says, "According to the data, the top five happiest countries are the United States, Canada, Australia, the UK, and Germany," and it has given the answer in the right order, which means we know that this solution works.

Before we move on to the next question, I would like to quickly show you how you can do the same if you are using the open-source Open Assistant as the large language model. The first thing you need to do is define your Hugging Face API key. On Hugging Face, if
you go to your Hugging Face account, you can find the token that serves as your Hugging Face API key. That API key is what you need to assign to the variable HF_API_KEY. Once you have assigned it, you need to create another object, just like the llm we created. If you are not using OpenAI you can give it the same name, llm, but because I already have something called llm, I'm using a different name here: oa_llm. Inside that I'm using OpenAssistant, which we imported at the start of the code, and we say the API token is HF_API_KEY. Open Assistant itself does not require any API key, but the way this program accesses Open Assistant is through the Hugging Face API, which is why you see this API key being used. You could modify this code so that you do not need an API key, but then you would need a GPU to run the model. As you know, I'm not running this on a GPU, I'm simply running it on a CPU, so it's using the API route. Like I said before, as of May 1st this code, specifically the Open Assistant code, does not work, but I hope it will be fixed by the developer in the future; that's why I decided to give you the code anyway.

Then it's quite simple: all you have to do is create a new PandasAI object with oa_llm as the parameter, and then you can use it to run queries. Now I'm going to move these two to the bottom of the code so that they don't disturb what we are doing, but this code will be shared as part of the Google Colab notebook, so in the future if you want to use Open Assistant you can directly use the code here. Let me label this as Open Assistant.

Now, getting back to our first code, which we
ran as pandas_ai.run with the question "Which are the five happiest countries?". Now let's say I don't want just the country names; I want to know the GDP of the happiest countries. For example, if these are the top three happiest countries, I want to know their respective GDPs: what is the GDP of the United States, of Canada, of Australia? Let me go ahead and write the question: "What are the names and GDP of the happiest countries?" As you can see, it has run the code and it is giving the response: "The happiest countries and their GDPs are listed as follows: United States with a GDP of this, Canada with a GDP of this, and Australia with a GDP of this." And if we go back to our data, you can see the US, Canada, and Australia, and we have got the right GDP numbers.

To show you that I'm not joking around, I'm going to run the same thing again, but with a different question. What else can we answer from this data? We could ask which country has the lowest GDP; let's ask, "What are the countries with the lowest GDP?" I put a question mark and ask this question, and it's going to get us the answer. Let's see: "According to the data, the countries with the lowest GDP are Japan, Australia, Spain, Canada, and Italy." Let me reorder it: Japan, Australia, Spain, Canada, Italy. Okay, cool, so that order is perfect. It adds, "However, it's important to note that GDP is not the only factor that determines a country's overall well-being. The happiness index is also taken into account." So that's another important aspect it tells us: GDP alone doesn't define whether a country is a good place to go or not.

Let me ask one more question: if I have to choose a country to migrate to, which one will you recommend based
on GDP and happiness index? Let's say three countries to migrate to. I'm honestly not sure if it will answer, because this is not a direct question for the DataFrame itself. Okay, it says, "Based on GDP and happiness index, I would recommend the United States, China, or Germany as potential countries to migrate to." So as you can see, it doesn't simply look things up in the dataset; while it uses the data to answer, you can ask more nuanced questions that can give you a better response about the data. I can very well imagine people in the C-suite, like CEOs and CXOs, who always like to ask simple questions. A lot of organizations waste a good data science resource on these kinds of menial tasks, and the waste is obvious because a simple SQL query or simple pandas code can answer them; but now we know that PandasAI can answer them too.

There are certain concerns that I wanted to highlight before I close the video. The first is that this uses an API. Whether you are using Open Assistant or OpenAI, the data is actually being sent to the server of the particular service provider, the large language model provider, and the answer comes back. For a lot of companies, especially in data science, data is the holy grail, which means not many companies would be happy to send their data to an external service. If you are a small company, or in a business where you do not care about this, or if this is a hobby project, it works pretty well and it's quite fine. But if you're part of a corporation where data is the holy grail, then this may not be applicable, so please consult your legal department before you use anything like this.

The second, bigger issue is that because it uses a large language model, it is still prone to prompt injection, a technique where somebody can manipulate your data or even delete your data, which means the same
prompt can be used to manipulate the model into giving a wrong answer as well. So keep these two things in mind: first, the legality of using this in your own setup; second, prompt injection. I've got a separate video about prompt injection, which I'll link in the YouTube description, but in simple language it's a way of using a prompt to produce an undesired result or manipulate the output of a large language model. These two are the concerns, because this uses a large language model.

Other than that, this is a wonderful project. It's a project that I think will put a smile on the faces of a lot of data scientists, those who are not worried about losing their jobs, of course. It can be used in conjunction with pandas, not as a replacement for it, like I said. Go star the repository and give a shout-out to the developer; this is an amazing repository. PandasAI and the Google Colab notebook will both be linked in the YouTube description. If you have any questions, let me know in the comment section; otherwise, I look forward to hearing your comments. See you in another video. Happy prompting!
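
The workflow described in the captions can be sketched roughly as follows. This is a minimal sketch, not the notebook from the video: the DataFrame values and country names are made up for illustration, the helper names (ask_pandasai, happiest, lowest_gdp) and the column names are hypothetical, and the PandasAI calls assume the interface as it is described in the captions (a PandasAI object wrapping an OpenAI llm, then run with a DataFrame and a prompt), which later releases of the library may have changed. The two plain-pandas helpers show the kind of "simple pandas code" the speaker says could answer the same questions, which is also a handy way to check an LLM answer.

```python
import pandas as pd

# Made-up illustrative values; this is NOT the dataset shown in the video.
df = pd.DataFrame({
    "country": ["Atlantis", "Borduria", "Freedonia", "Ruritania", "Zubrowka"],
    "gdp": [500, 120, 300, 80, 950],
    "happiness_index": [6.1, 7.4, 5.2, 7.0, 6.8],
})


def ask_pandasai(frame, question, openai_api_key):
    """Send a natural-language question about `frame` to PandasAI.

    Assumes the PandasAI interface described in the video (around May 2023).
    Requires `pip install pandasai openai`, a valid OpenAI API key, and
    network access. Data from the frame is sent to the LLM provider.
    """
    # Imported lazily so the rest of this sketch runs without PandasAI installed.
    from pandasai import PandasAI
    from pandasai.llm.openai import OpenAI

    llm = OpenAI(api_token=openai_api_key)
    pandas_ai = PandasAI(llm)
    return pandas_ai.run(frame, prompt=question)


def happiest(frame, n=5):
    """Plain-pandas counterpart of 'Which are the n happiest countries?'"""
    return frame.nlargest(n, "happiness_index")[["country", "gdp", "happiness_index"]]


def lowest_gdp(frame, n=5):
    """Plain-pandas counterpart of 'What are the countries with the lowest GDP?'"""
    return frame.nsmallest(n, "gdp")[["country", "gdp"]]


if __name__ == "__main__":
    print(happiest(df, 3))
    print(lowest_gdp(df, 2))
    # Uncomment with a real key to try the LLM route (assumed interface):
    # print(ask_pandasai(df, "Which are the 5 happiest countries?", "YOUR_OPENAI_API_KEY"))
```

Comparing the LLM's answer against happiest or lowest_gdp on the same DataFrame is one way to repeat the manual check the speaker does by looking back at the data.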
Info
Channel: 1littlecoder
Views: 10,544
Keywords: ai, machine learning, artificial intelligence
Id: cGA9Yabg0Yc
Length: 13min 44sec (824 seconds)
Published: Mon May 01 2023