This Will Change Data Science as We Know It (ChatGPT)

Video Statistics and Information

Captions
This is so good, it's even specifying the types over here. It actually did the job.

Unless you've been living under a rock, this thing has been popping up everywhere, and for a good reason. Having a background in artificial intelligence, I'm of course naturally interested in new developments like this, and I've been putting it to the test, but mainly just to play around with it and see its capabilities. But today I was recording a new video for my YouTube series, where we cover a complete machine learning project from start to finish using Python, and I came up with the idea to put ChatGPT to the test with an actual data science problem. The series that I was recording today was all about outlier detection, how to detect outliers in sensor data, and the results that I got were so good that I was like, okay, I have to stop what I'm doing right now. I have to make a video about this, because you guys have to see this before ChatGPT turns into a paid model or something like that. Right now, as you can see right here, it's a free research preview, and this is free for everyone. So let me show you what I did this morning, and let's see how awesome this is.

In the YouTube series that I'm working on, we're working with sensor data that is measuring accelerometer and gyroscope data. That's not really important for this video right now, but we want to create an outlier detection algorithm that can go over the first six columns, the ones that you are seeing right here, loop over all these numerical values, identify whether there are any outliers, and then mark them as either true or false. This is basically the request that I gave to GPT-3, so let's see how it works.

So here we go: "Create a Python function that can mark columns from a pandas DataFrame as outliers using the IQR method." Let's see what we get. Okay, so it's thinking. Okay, here it's starting: "To mark columns as outliers using the IQR method..." Okay, now it's actually creating code for us. So, a function, mark_outliers_iqr,
really good name. So we have a pandas DataFrame and we have a column, and then, okay, what does it do? Q1, Q3, looking good. IQR, yeah. Okay, what else you got? Oh, we even get an example. Wow. Okay, so let's check this out.

Without changing anything, we copy the code and we come back over here. Let me clear this up. Now, this function takes as input a DataFrame, so that is our df, and let's say we want to take the first column in our DataFrame. So let's run this. Wow, okay, so we're getting an outlier column. So it's all NaNs, is that correct? There are actually True values in here, so it actually did the job. We have a function that can take a DataFrame and a column as input, and output that same DataFrame with a new column called outlier. And we can make this even better: we don't call this outlier, but we say col, and then plus, and we do an underscore. Let me check. So now we should have... yes, so we know which column it is. Okay, this is really awesome.

And now what if we say, for col in outlier columns, we do this, and then we change it to that. Let's start with a fresh DataFrame again and check the result. For all of the six numerical columns within our DataFrame, we now have a series indicating with either True or NaN whether the value is an outlier or not. So wow, that is actually amazing, right? This is so cool.

Let's do one more test. So now: "Create the same function, but with the local outlier factor, or LOF, method." Let's see if this actually works. Okay, here it goes. Yes, it's using scikit-learn's LocalOutlierFactor. This is so good, it's even specifying the types over here: the DataFrame is supposed to be a pandas DataFrame, the column is supposed to be a string, and the output is a pandas DataFrame. This is such a proper way of writing a Python function, and I typically never do that because I'm too lazy to write it out. I also like the name of this function: first mark_outliers_iqr, and then LOF, using the abbreviation of local outlier
factor. It's just like, how can it make sense of all of that? I find it so interesting. This is legit a better function than I would be able to come up with on my own. Also, it's properly commented. It's beautiful. Wow. All I can say is that I am impressed. Like, for real, artificial intelligence is here. It's here in front of us, we can use it, and it's free for everyone. Just go to chat.openai.com and play around with this.

I will definitely be putting this more to the test, also for my data science project, but just from looking at this test alone, there are so many possibilities with this. Of course, you still need your expert judgment and experience as a data scientist to determine whether this code is actually useful and can be applied to the problem at hand. But this can save you so much time looking up certain syntax, writing out the specific structure of Python functions, and commenting everything. It's beautiful, it saves so much time, and I think this can really help people that are new to data science as well, just by looking at examples: okay, how do I do this? How do I structure a function? How do I comment code? You basically have a mentor that you can look up to and see, okay, how is my mentor, in this case the AI, writing this code? And then you can learn from it and apply it to your own problems.

It is really fascinating, and this is going to change data science as we know it. And not just data science: coding, and I would even go as far as saying the world in general. This is such a huge leap in technological improvement, I would say. I'm really impressed, man. So go ahead and try it out. Playing around with this is actually really fun.

Now, that's what I wanted to show you in today's quick video, and I should probably get back to recording the video that I was supposed to be recording today. If you find this video helpful, then please consider subscribing to
the channel. For those of you that are new here, my name is Dave. I work as a freelance data scientist. I'm also the founder of Datalumina, which is a coaching business for data professionals that want to learn how to start a business, and on this YouTube channel we make videos about machine learning, Python, data science, and also freelancing. So if that's your thing, consider subscribing, please like the video, and then I'll see you in the next one.

Wow, what the actual
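
The code generated in the video is not reproduced in the captions. As a rough illustration of the IQR approach described above, here is a minimal sketch, not the code ChatGPT actually produced. It assumes the common 1.5 × IQR fences, and the sample data and column names (acc_x, gyr_x) are hypothetical. Unlike the output shown in the video, which held True or NaN, this version writes plain True/False flags.

```python
import pandas as pd


def mark_outliers_iqr(df: pd.DataFrame, col: str) -> pd.DataFrame:
    """Return a copy of df with a boolean column named '<col>_outlier'."""
    out = df.copy()

    # First and third quartiles, and the interquartile range between them.
    q1 = out[col].quantile(0.25)
    q3 = out[col].quantile(0.75)
    iqr = q3 - q1

    # Assumption: the conventional 1.5 * IQR fences.
    lower = q1 - 1.5 * iqr
    upper = q3 + 1.5 * iqr

    # Name the flag column after the source column, as suggested in the video.
    out[col + "_outlier"] = (out[col] < lower) | (out[col] > upper)
    return out


# Hypothetical sensor readings; the real dataset is not shown on this page.
df = pd.DataFrame(
    {
        "acc_x": [0.1, 0.2, 0.15, 0.18, 5.0],
        "gyr_x": [1.0, 1.1, 0.9, 1.05, -20.0],
    }
)

# Loop over the numerical columns, adding one flag column per input column.
outlier_columns = ["acc_x", "gyr_x"]
for col in outlier_columns:
    df = mark_outliers_iqr(df, col)

print(df)
```

Because each call returns a copy with one extra column, the loop can be rerun on a fresh DataFrame without overwriting earlier flags.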
Info
Channel: Dave Ebbelaar
Views: 8,544
Keywords: Data Science, Machine Learning, Python, chatgpt, openai, machine learning projects, data science hack, data science tips, ai, ai for data science, chatgpt for data science, openai for data science
Id: NDaOTA6bTrk
Length: 6min 57sec (417 seconds)
Published: Thu Dec 15 2022