ChatGPT Code Interpreter - This Will Change Data Science Forever

Video Statistics and Information

Captions
This video is sponsored by Brilliant.

ChatGPT's Code Interpreter was just released. This plugin allows us to upload data, write and execute Python code, do data analysis, generate reports, and even download the code and reports generated in seconds. This is like having a junior data analyst that will do all of our work 24/7. In this video, I'll show you everything we can do with the ChatGPT Code Interpreter, so let's get started.

All right, the first thing we have to do is enable Code Interpreter. We open the sidebar and click on the three dots, then we select Settings and go to Beta features, and here we have to enable Code Interpreter. We enable this option and we'll be able to work with Code Interpreter. Okay, now I'm going to close this window, and then what we have to do is select GPT-4, and you should see Code Interpreter in the drop-down. We select Code Interpreter, and after that you should see a plus button in the chat bar. That plus button is going to allow us to upload new files: a new CSV file or any type of file that you want to upload to ChatGPT.

In this case I'm going to upload this population_total.csv file, and we're going to work with this file to do some data analysis. By the way, you can download this dataset in the description below. Okay, I'm gonna click on Open, and as you can see, the CSV file is being uploaded, and now it's uploaded to ChatGPT. What I'm going to do is use this prompt, and this prompt says: "Act as a data scientist and analyze the dataset, and make charts and graphs to show the major trends in population growth around the world." This dataset is about the population of different countries over many years, and in this case I'm letting ChatGPT choose the charts and graphs for this analysis. So I'm going to press Enter, and we'll see all the analysis that ChatGPT is going to make on its own.

All right, now let's see everything ChatGPT is gonna do with Code Interpreter. First, it gives us a quick overview of the dataset.
As we can see here, these are the three columns that our dataset has, and then it automatically detects that there are seven missing values. Here we can also see the code that is being generated: it's using pandas to read the CSV file, and then to spot these missing values it's using the isnull method, and then the dropna function to drop these missing values. After that, it's doing some descriptive statistics to give us some insights about this dataset, and finally it's giving the visualizations that we asked for. In this case we didn't specify any type of visualization, but it generated a bar plot using Matplotlib, and then this line plot using Matplotlib too.

Okay, this is very cool. We didn't do much prompting, and we got all this analysis in a few seconds. Now let's export this whole report as a PDF. So I type this: "Answer it in the form of a multiple-page PDF download," and ChatGPT is going to provide a link for us to download all this analysis in PDF format. All right, now I have the link, so I click on "Download the PDF," and after this I'm gonna get a PDF file. I'm gonna download this PDF file, and if I click on it, you'll see that we have all the visualizations we asked for: first the bar plot and then the line plot. So it's exactly what we got, but now in PDF format, so you can analyze it without all the explanation that ChatGPT gave before. We can also export all the generated code in Jupyter notebook format, and we're going to see how to do this in the next example.

All right, for the next example we're gonna do a lot of prompting to get a more customized report. I'm gonna upload another dataset, in this case players_20.csv, which has information about soccer players in the game FIFA 20. This dataset has around 100 columns, so unlike the previous dataset with three columns, now we're gonna work with a lot of data. I'm gonna upload this CSV file, which is also going to be in the description, and after doing this we're gonna type the following prompt.

Here's the prompt. First, I'm telling ChatGPT to filter only some countries, so I want to analyze soccer players from some specific countries, and well, that's my first instruction. Then I'm asking it to generate some visualizations: the first is a bar plot, then I want a histogram and a box plot, then a scatter plot, and then a pie chart. For each visualization I'm giving some extra instructions about the variables that I want to analyze. Let's see if Code Interpreter can handle all of this. Remember that this is a lot of data, there are many columns and many rows, and we're giving some specific instructions to get a customized report.

First, it's giving an overview of this dataset. It's telling us that it has 106 columns, but then it's going to select some columns and filter only the countries that I specified. So far so good. Then we have our first visualization, which is this bar plot that was generated with this code. Then it's generating the histogram and the box plot, and well, it looks great: we have the histogram and the box plot that we asked for. Then it's also generating the scatter plot, and finally it's generating the pie chart. In the case of the pie chart it's using some weird colors, so what I'm going to ask is to change the colors of the pie chart. I only ask this, and as you can see, now I get different colors for the different elements in this pie chart.

Finally, I'm going to ask ChatGPT to add the link to download this report in PDF format. Now, as you can see, I have the link, and I have all the visualizations here in this PDF file, and the last visualization, the pie chart, is the pie chart with all the
modifications that we asked for, with the different colors, and that's how we get this PDF. We can also download the generated code in .ipynb format, which is the format used for Jupyter notebooks. All right, now I'm going to click on "Download the Jupyter notebook," and if I open this file, we're going to see all the code that was exported into this .ipynb file, and now we can make some modifications to the code if we want to.

But now let's put ChatGPT's Code Interpreter to the test with multiple files. I'm going to upload multiple datasets, and then we're gonna do a data analysis based on those datasets. I'm going to upload five datasets from the FIFA game, in this case players_17, players_18, 19, 20, and 21. Each dataset represents the game in one year, so the years 2017, 2018, 2019, 2020, and 2021, and we're gonna make a visualization using the five datasets. As you can see, unfortunately we cannot upload the five files in one shot; we have to upload them separately.

Now that all the files are uploaded, I'm going to use this prompt. Here I'm telling ChatGPT a little bit about these five datasets, then I'm telling it to create a new column in each dataset, and then I'm giving some instructions describing the type of analysis that I want to do. In this case I want to get a line plot with the evolution of the FIFA ratings for these five players from 2017 to 2021. Then I press Enter and it starts with the analysis. It first tells me that the data was loaded for the players that I specified, and then it's creating the line plot that I asked for. As you can see, first it had an issue, but then it found a solution for the overall column that is in this dataset, and then we have our line plot. So we could verify that ChatGPT's Code Interpreter can analyze not only one single dataset but also multiple datasets, and it generates all these reports with this visualization, and we can export all this into a PDF file,
and we can also export the code to read in Jupyter notebooks, and that's very cool. But if you don't develop your analytical and coding skills, you won't be able to know what, for example, a positive correlation is, or why you have to drop those seven missing values that were found in this dataset. Remember that at the end of the day, as a data analyst or data scientist, you'll have to make some decisions based on different factors, and that's why you have to develop your analytical thinking.

A free and easy way to develop your analytical thinking and improve your coding skills is using brilliant.org, which is the sponsor of this video. Brilliant is the best way to learn math, data science, and computer science interactively. It has thousands of lessons, from foundational and advanced math to data science, with new lessons added monthly. I love the data science course on Brilliant because it helps develop my analytical thinking, and within a few quick lessons you get to analyze real-time data and draw interesting conclusions from it. Also, with Brilliant I can learn and review this and more topics through problem solving, which isn't about memorizing formulas or equations but learning how to think, and this is very important when it comes to developing your analytical thinking. To try everything Brilliant has to offer free for a full 30 days, visit brilliant.org/ThePyCoach. The first 200 of you will get 20% off Brilliant's annual premium subscription.

All right, that's it for this video. Let me know in the comment section if ChatGPT's Code Interpreter is going to make the life of data scientists and data analysts easier or if it's going to replace them. I'll see you in the next one.
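
The cleanup steps described in the first example (reading the CSV with pandas, spotting missing values with isnull, dropping them with dropna, then computing descriptive statistics) might look roughly like the sketch below. The code that ChatGPT generated is not reproduced in the captions, so this is only an illustration: the helper name load_and_clean, the column names, and the sample values are all assumptions, not the contents of the real population_total.csv.

```python
import io

import pandas as pd

# Hypothetical stand-in for population_total.csv. The column names and
# values are invented for illustration and are not taken from the real file.
SAMPLE_CSV = """country,year,population
Aland,2000,100
Aland,2010,120
Borduria,2000,
Borduria,2010,230
Syldavia,2000,300
Syldavia,2010,
"""


def load_and_clean(csv_source):
    """Read a CSV, count missing values per column, and drop incomplete rows."""
    df = pd.read_csv(csv_source)
    # isnull() marks each missing cell; sum() counts them per column.
    missing_per_column = df.isnull().sum()
    # dropna() removes every row that has at least one missing value.
    cleaned = df.dropna()
    return cleaned, missing_per_column


cleaned, missing = load_and_clean(io.StringIO(SAMPLE_CSV))

# Descriptive statistics (count, mean, std, min, quartiles, max).
summary = cleaned["population"].describe()

print(missing)
print(summary)
```

With a real file, the same function can be called with a path such as "population_total.csv" instead of the in-memory sample. Whether dropping rows is the right treatment for missing values depends on the analysis, which is the point the video makes about developing your own judgment.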
Info
Channel: The PyCoach
Views: 13,656
Id: Jkn5g-wep1I
Length: 10min 13sec (613 seconds)
Published: Mon Jul 10 2023