The End of Data Analysts?!? (ChatGPT's Code Interpreter)

Video Statistics and Information

Captions
Nerds, ChatGPT's new Code Interpreter plugin can do some pretty advanced problem solving for my job, like analyzing this data set and showing me some pretty incredible insights in a matter of seconds, and a lot of people are claiming that this is going to take away data analyst jobs. So I've been testing this bad boy non-stop since its release, and I have some prompts to see if it's true.

First test up is some EDA, or just exploring a data set. Let's see what type of files this tool can take, and it looks like it can take a number of files. Let's go with CSV. Providing this data set with no prompt, it takes the initiative to start diving into this and exploring what this CSV is even about. It shows a sample of the first four rows of the data set along with some of the columns, and all of this is done with Python code, which you can easily see if you want to. That's not too bad.

Let's move into a harder question: asking it what this data set is even about. It pretty impressively identifies that this is a collection of job postings related to data science roles, where each row in the data set represents a different job posting. It then goes as far as to highlight some of the key columns in the data set with a description.

Next I prompt it: "With your knowledge of the data set, perform exploratory data analysis," and it starts by identifying five main steps that it's going to take. First it shows the data types, and it looks like every one's an object, so that's not really useful. From there it even identifies that there are missing values, with a summary clarifying which columns have a significant number missing. Next it provides some summary statistics about the numerical columns. It even goes as far as visualizing these distributions on its own and then providing an analysis of them. It points out that the salary data is skewed to the right, which is typical for salary data. It's pretty neat that it provides this analysis of how this compares to expected values. Finally, it identifies some columns of
interest that it wants to dive into further, and it provides these visualizations showing what the top 10 job titles and the top 10 companies are. It looks like data engineers and data scientists are beating out data analysts. And if I want to export this code, all I have to do is prompt it. This is pretty revolutionary, so I think it's settled: Code Interpreter, it's taking my job.

Well, not so fast. To answer that we need to look at this. Yes, it's a blank Excel spreadsheet, but hear me out. In order to look into the future of what AI holds for our jobs, we need to look at the past: what previous tools have done to transform our jobs. Before spreadsheets migrated to computers, they started out being physical papers that accountants would use in order to calculate finances. Yeah, a physical paper. My hands hurt just thinking about this. There were entire departments with numerous accountants with the sole purpose of updating these paper spreadsheets. Then in comes the invention of the personal computer, and these dudes went and revolutionized the way we work. "In this video, you're going to see the future." And so the electronic spreadsheet was conceived. Marketers everywhere began to over-dramatize just how powerful this tool was going to be, with some ads claiming it would take up to 150 jobs. But you know what happened to accountants' jobs after this? Well, they actually increased. Accountants could now refocus their time from tallying numbers and focus on more important things, like building those mind-numbing PowerPoints. But seriously, their attention now shifted into performing deeper analysis. They could now use these tools and provide higher value with their jobs. And with that history lesson, maybe we can also infer where we'll go next with these AI tools.

All right, so it's been a few days, and I've been going through and using Code Interpreter for my job, analyzing and trying to find any limitations. Surprisingly, I found quite a bit. Also during this, OpenAI released a new feature
for custom instructions, so I've customized my graphs a little bit and they're going to look a little bit different. So let's dive into some of those limitations by doing a deeper dive of that data set that we were exploring before.

We're going to start with a new chat, since my last one timed out. Here I have a folder with a Python file we exported last time, the data set, and also a text file that I had ChatGPT output last time that summarized all our analysis. I can compress this into a zip file for upload and then prompt ChatGPT to familiarize itself with the contents of this file, and it looks like it knows where we left off.

So we're going to dive into exploring the skills from these job postings. Specifically, I want to see what the most common skill requested is, and conveniently it gives me this graph, which looks pretty good at first sight. But after diving into it, I find that the highest number of occurrences, for the keyword Python, is 56,000, which is not possible because there are only 50,000 job postings, and ChatGPT should know this; it was in the summary. This type of mistake is something I would expect a data analyst to pick up on, and yet ChatGPT doesn't. So I re-prompt ChatGPT to fix this error, and then I take it a step further by having it display these keywords as a percentage vice occurrences, and we finally get this visualization showing the top 20 keywords in data science job postings.

Now, I did the same analysis a couple of days ago and I ran into even more issues. The first time I asked for this, it just gave me blank graphs, and then it had the audacity to start hallucinating what the top skills were in this data set. It basically reverted back to what it knew as a large language model vice actually using the data provided. I then re-prompted, telling it that the graphs were blank and that it needed to fix this error, and it didn't really seem to fix it. Eventually I just prompted it to print after every step in the code. It somehow worked itself out, and I ended up getting this
final visualization. So going back to that Excel history lesson: yes, this is a powerful tool, but it still takes some sort of human operator to help guide and steer this tool on where it actually needs to go and make sure that it's staying on track. This is especially true when we're diving into deeper, more complicated subject areas. But what happens if I need a quick ad hoc analysis of maybe a subject area that I'm not familiar with, like something that's not data science job postings?

All right, so I think I have a unique use case for Code Interpreter, and it involves this. "Alexa, play Spotify." So I have two outdoor speakers here, and if you can't tell, it's really not that loud. There's a big problem that I'm having: these speakers themselves are meant to be connected into some sort of amplifier, and right now we just have them going right into Alexa. When I go to Amazon and look for an outdoor amplifier, I get a whack of results. I'm not really sure which amplifier to choose, so I searched through quite a few forums trying to find out what size amplifier I needed to get, and it looks like it's really math-based. That's why I think Code Interpreter is going to be perfect for this. So I looked up the model number online and found the different specs that I think I needed, and I gave this information to it. This is pretty crazy: ChatGPT went through and knew it needed to calculate both the impedance and maximum power, and it used some Python code to actually calculate both of those things and determine what they need to be. So I think I found the perfect amplifier, from this company called x-rong. So now we just gotta wait for that amplifier to come in, and we'll test to see if ChatGPT was right.

So I've tested Code Interpreter to its limits, and I think I've reached them. Now I have some bad news: Code Interpreter is probably not going to take away my job with these limitations. On second thought, it's actually good news. So let's say we have some data online, like this Google Sheet full of
data. In the past I've used Python to connect to this; however, it tells me it doesn't have the ability to access the internet. Prompting it further, asking if I can just pip install libraries, it tells me this is prevented as a security feature designed to protect user data and privacy. I kind of get this, because ChatGPT's previous plugins had issues when accessing the internet. Anyway, because of all of this, I now have to take an extra step of downloading the data that I need and then uploading it to ChatGPT. And my data is spread all over the place: I don't just have it in Google Sheets but also in things like databases. So I downloaded one of my databases to a CSV for analysis, which at about a million rows was a pretty big data set. When I tried to upload it, it gave me this warning that it has a small-ass file limit and an environment limit of only two gigs, so I can't even get all the data that I have here into it to analyze, which is like the first step of my job. I found that the most I could get in was 200,000 rows of data, so I was super disappointed in this.

Now, there are workarounds in ChatGPT for connecting to an external data source, like the plugin Noteable, and I have a whole video on it. But comparing both these tools, although Noteable excels in some areas like data connections, I find Code Interpreter performs a much more thorough analysis with less prompting.

Anyway, these internet issues and file limits aren't even the most detrimental issues to Code Interpreter. So I polled my subscribers on LinkedIn and Twitter, I mean X, and asked them what is stopping them from implementing this tool in their jobs, and it was a resounding consensus that they had concerns with security issues. You see, these chatbots take these prompts, and also the data that you give them, to then be used to build on and improve these chatbots. The problem is, if it's confidential data, it could be seen by the reviewers of this chatbot, or even worse, it could be fed back into this chatbot and potentially be prompted by
another user and seen by them. It's kind of a big deal. So big, in fact, that Google has told its own employees not to put confidential data into their very own chatbot, Bard. That's like telling Meta employees they can't use Facebook or Instagram. This is a pretty big flag. However, with all these limitations, I think there's hope for the future. Take Midjourney for example, an AI tool to generate art content. Look how far this tool has come in as little as a year. So imagine where we'll be in the future with these types of tools once we get through all these different limitations.

Anyway, the amp arrived today and we're going to go install it. I hope it doesn't go X wrong. [Music] After unpacking this bad boy, I realized it had no instructions, so I resorted to using ChatGPT to tell me how to install this, and it gave okay advice. Anyway, after a few steps and nearly getting shocked and losing my life, I was able to get it installed. All that's left now is to test that bad boy. Oh, and if you're curious about that video that uses the Noteable plugin, it's right here. And with that: "Alexa, play Spotify." As always, if you got value out of this video, smash that like button. With that, see you in the next one.
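
The keyword-count problem described in the captions (56,000 occurrences of Python across only 50,000 postings) is what happens when every occurrence is tallied instead of every posting that mentions the keyword. A minimal sketch of the percentage version follows, assuming each posting is a plain text string. The function name `keyword_share`, the tokenizing pattern, and the sample postings are hypothetical illustrations, not the code from the video.

```python
import re
from collections import Counter


def keyword_share(postings, keywords):
    """Return {keyword: percent of postings mentioning it at least once}."""
    total = len(postings)
    if total == 0:
        return {kw: 0.0 for kw in keywords}

    counts = Counter()
    for text in postings:
        # Build a set of tokens so a keyword repeated inside one posting
        # is still counted only once for that posting.
        tokens = set(re.findall(r"[a-z0-9+#]+", text.lower()))
        for kw in keywords:
            if kw.lower() in tokens:
                counts[kw] += 1

    # Sanity check from the captions: a keyword cannot appear in more
    # postings than exist in the data set.
    assert all(c <= total for c in counts.values())
    return {kw: 100 * counts[kw] / total for kw in keywords}


# Hypothetical sample postings, for illustration only.
postings = [
    "Python and SQL required. Python experience a must.",
    "Looking for SQL and Tableau skills.",
    "Excel, SQL, and Python.",
    "Power BI and Excel reporting.",
]
shares = keyword_share(postings, ["python", "sql", "excel"])
print(shares)
```

Here "python" occurs three times in total but in only two of the four postings, so its share is 50 percent; a share computed this way can never exceed 100 percent.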
Info
Channel: Luke Barousse
Views: 244,748
Keywords: data viz by luke, business intelligence, data science, bi, computer science, data nerd, data analyst, data scientist, how to, data project, data analytics, portfolio project, sql, excel, python, power bi, tableau, data engineer
Id: hoTS5pIKgPU
Length: 10min 39sec (639 seconds)
Published: Wed Jul 26 2023