How to learn data science in 2021 (the minimize effort maximize outcome way)

Video Statistics and Information

Captions
Ken Jee did an awesome video on how he would learn data science in 2021. There are also great videos from Krish Naik and Data Professor. They outline all the topics that you need for data science and the steps to get there, and I highly recommend that you check them out.

So why am I even making this video? Well, you see, I'm not really the kind of person that can just motivate myself to go through a series of courses, books, or videos based on sheer motivation, willpower, or even interest. I've tried in the past, and I've failed miserably every single time. I get that I need to learn programming, statistics, machine learning, and all these different topics. What happens to me every single time is I'm like, "You know, I could be doing data science. I could also be laying in bed and watching cat videos." My style is more like this video, where I taught myself SQL from scratch in 11 days to get my current FAANG job. I call it the minimize effort and maximize outcome method.

So in this video, I'm gonna stand on the shoulders of giants, aka these amazing resources that are out there already, and show you guys exactly what I would do, focusing especially on designing a system that would set me up for success in completing the very daunting task of learning data science.

The topics to cover are: programming; basic stats; data visualization; exploratory data analysis; machine learning algorithms, both math and implementation; data scraping, such as APIs; databases; niches like NLP and deep learning; and deployment.

And here is the general framework I would use: one, learn just enough; two, do a project; and three, iterate. And the most important one: accountability, as much of it as possible, whenever and wherever possible.

All right, you guys can go now. No, just kidding. I think you should stay, because I'm gonna go through how to implement this framework to cover all of these data science topics step by step. I'll also show you guys how I would choose projects. Finally, stay until the very end of this video, because I'm also going to
tell you guys my top-choice resources and how to use them.

I recommend starting off with programming, specifically Python, because it's the most intuitive, it's multifunctional, and it has the best packages for machine learning. Programming is the most important thing to learn because it gives you the toolkit to do anything. And here's what to learn about Python (remember, learn just enough). In general Python, that's variable declaration, loops, and what OOP is. For data science specifically, you should also learn two packages: NumPy and pandas. I would just understand how NumPy works and then focus on pandas, since NumPy is the basis for pandas.

Then you need to know some stats. Nothing crazy here. We're talking Stats 101, like the first half of Stats 101. I mean mean, median, mode, variance, standard deviation, correlation, and distributions. Realistically, if you've gone through high school math, you have way more than enough already.

Next up is visualization. I prefer seaborn because I think it's the easiest and fastest way to get a pretty decent-looking graph. It's built on matplotlib, but I wouldn't bother digging into that until later. If you know pandas, it would take you like one minute to generate your first seaborn graph.

EDA, exploratory data analysis: this is exploring a dataset. Is there missing data? How many variables are there? How many rows do you have? Are there categorical variables? Continuous variables? What's the distribution of each variable? This is a combination of everything you've learned up to now: Python, stats, and visualization.

At this point, you have all the minimum skills you need to begin your first proper project, and this is where, personally, I would start my first project if I were doing it from scratch. But for those of you that want to get your hands dirty with ML so you can do your first proper ML project, or you're a little bit more advanced, let me also cover that quickly. There are 10 to 20 common ML algos, and one way that people divide them is into supervised
learning, unsupervised learning, and reinforcement learning. This is just one way of dividing them; there are actually several ways out there.

I recommend starting with the theory or math and then the implementation. The very minimum of math. I think people get really scared when it comes to math, myself included. The reason why I say theory slash math is because you don't have to know the exact math behind it. I just mean that you need to know how the algorithm works. For example, k-nearest neighbors is finding the distance between a data point and all the examples in the data, selecting the specified number of examples (k) closest to the data point, and then voting for the most frequent label in the case of classification, or averaging the labels in the case of regression.

Then, implementation. Theoretically, implementation is simple, just a couple of lines of code, but in practice there's actually a lot of nuance here. Your exploratory data analysis feeds directly into how you wrangle and transform data and which ML algorithm you ultimately choose to solve your specific problem. This is why it's very important that you understand how each of the algorithms works first.

Okay, so you're now ready to do your first ML project. My goal would be around 20 to 30 hours to get to this point, starting from scratch. And you know, that doesn't sound like a lot, right? But that's because you always have to fight off the temptation of diving too deep into each of the four areas just listed earlier. Remember: learn the minimum and then do the project. Your covering of the basics is just the very beginning of the learning process. Your real learning starts when you start the project, because doing projects is the absolute best way to learn.

For those of you that are curious as to why projects are the best way to learn, it's because, one, there are a lot of studies that show practicing something yourself, aka doing projects, is the best way of getting knowledge and skills into your head, so you learn faster, more deeply, and
retain longer. This is because you're engaging more parts of your brain than when you're just sitting there and passively consuming information. Two, learning is a process that is never-ending and oftentimes feels overwhelming, because there's just way too much to learn. Projects are how you scope it down to a manageable chunk and cement your knowledge. It makes what you learn concrete, and you get a great sense of accomplishment and completion when you finish a project. And three, it's how you keep yourself excited and motivated. The idea of actually doing your own project on what you're interested in should be super exciting, and I hope it is.

All right, hopefully you're convinced of the paradigm: learn the minimum amount and do the project. And the third part of the framework is to iterate, iterate, iterate. Dive deeper into each step of programming, stats, visualization, EDA, and machine learning. Then add in other topics: data scraping, such as APIs, for getting your datasets once you graduate from using pre-made datasets; databases for storing data; deployment of your ML models. Then start exploring niches like NLP, deep learning, and computer vision. There are so many more out there. I'm not going to go into detail and cover all of these, because this video would be way too long, and there are many, many resources that cover these very well. Keep watching for my recommendations.

Before we move on, I wanted to also address the last and most important part of the framework: accountability. This is the key to actually doing anything. You know, sometimes you watch a video like this one that tells you exactly what you need to learn, and you're like, "Oh yes, makes sense, I'm totally gonna do it, I'm super motivated," and then two weeks later... In school, you usually get stuff done because if you don't, there are consequences: you don't do well, or fail, or get kicked out of school. Well, since we're doing this by ourselves, you've got to create those consequences for yourself. Learning data science is hard, and to set yourself up for
success, you have to try to incorporate accountability in every way possible so you don't just give up.

For me, just deciding that I'm going to do something is definitely not enough. In fact, if I go and tell somebody that I'm gonna go do a project or something, it's still not enough for me. I guess disappointing one person is not enough. So you see, what I have to do is tell lots of people and potentially disappoint all of them. My most recent example of this: if you guys watched my first vlog, I said I would be doing my first NLP project. So yes, I definitely want to share my day-to-day life with you guys, but another really big reason is to hold myself accountable. I very deliberately stuck that in there, very prominently, that I was doing an NLP project, because I know otherwise I would find lots of reasons to give up when things get hard. See, the fear of disappointing you guys got me through it, and I did it.

I don't think you have to go and create your own YouTube channel, although I obviously think that's a great idea. You can also post on LinkedIn or Instagram or whatever other forms of social media. Communities are also great. Ken Jee is starting up 66 Days of Data again in 2021, and that is a perfect place to keep you accountable. Plus, it's a great place to chat with others who are like-minded and ask questions.

All right, resources. How do we tie all this together into a solid, concrete plan? Personally, I would actually go for a paid course, and that's not because the information is not freely available on the internet. It's more that if I don't know something, I don't know what I don't know, so I could spend a lot of time trying to gather the pieces of information and arrange them in the best way possible so I can learn most effectively. So yes, I would go find a course that covers all the topics listed above, and for each step, if the course doesn't already have a project, I would go do one myself as soon as I have the minimum amount
of knowledge.

I've checked out some of the most highly rated courses out there, like Python for Data Science and Machine Learning Bootcamp by Jose Portilla and Introduction to Data Science Using Python by Rakesh (I'm not gonna say the name or else I'm gonna butcher it). Also, thank you to one of the commenters who made me aware of 365 Data Science's bootcamp. I would choose any one of these courses, because they all have great reviews and they all cover the topics, or most of the topics, that I listed above, although for all of them, the way that they're presented is not exactly the order in which I would learn them myself. But realistically, I'll be using several different resources anyway, since there really isn't a perfect option that would suit anyone perfectly. Something that I've learned is that you shouldn't be married to a single course and feel bad if you don't understand something that the instructor is talking about, or if you don't even finish every single part. Do whatever you have to do to learn enough so you can get started on that project.

And for statistics and machine learning algorithms, whatever paid resources you may choose to buy, I cannot think of any paid resource that is better than StatQuest by Josh Starmer. I was so starstruck when I was in the same convo as him a couple of months back, because he was the one that saved me from failing a grad school data science slash stats class, quite literally. The way that he explains things is super intuitive, and with the animations, his songs, and everything else, it's just amazing. I'm fangirling right now.

I could also make an entire video on how to pick projects, but here are my best tips. If you're nervous and have a severe dread of failure like myself, I would recommend just picking a project somebody has already done and then adding a little more to it. You can start off with the famous Titanic dataset on Kaggle, get the most popular repo, and just do another distribution or try a different machine
learning algorithm. Then you could try switching out one dataset for another dataset. See, little by little, you start building confidence, and your projects become more self-directed as well as more complex and more awesome. On YouTube, StatQuest by Josh Starmer does full projects, and so do sentdex, Krish Naik, Data Professor, and Ken Jee.

So after I decided to commit, I would announce on social media that I'm going to be joining Ken's 66 Days of Data. Is this a foreshadowing? Maybe. And then, personally, I would go make vlogs and announce projects and timelines for when I'm going to do them. In this way, if I don't want to do something, I think about having to post that I'm not going to do it, or that it's delayed, or something like that, and that gives me lots of fear and anxiety that I'm disappointing people and making myself look like an untrustworthy and unreliable person. So I would probably just go and do it.

Wow, that is a lot. Let me know in the comments if you guys have any questions, and I'll be more than happy to flesh things out more. There you go: that is how I would learn data science in 2021, the minimize effort and maximize outcome way. See you guys in the next video, and happy holidays to everybody.
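
The k-nearest neighbors steps described in the captions (measure the distance to every example, keep the k closest, then vote or average) can be sketched in plain Python. This is only an illustrative sketch, not code from the video: the function name `knn_predict`, its parameters, and the tiny toy dataset are all assumptions made up for the example, and a real project would more likely use an existing library implementation.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3, task="classification"):
    """Hypothetical helper: predict a label for `query` from its k nearest examples."""
    # Step 1: distance between the query point and every example in the data.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Step 2: keep the k examples closest to the query point.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    nearest_labels = [label for _, label in nearest]
    # Step 3: most frequent label (classification) or mean label (regression).
    if task == "classification":
        return Counter(nearest_labels).most_common(1)[0][0]
    return sum(nearest_labels) / len(nearest_labels)


# Made-up toy data, purely for illustration.
points = [(1.0, 1.0), (1.5, 2.0), (5.0, 5.0), (6.0, 5.5)]
classes = ["cat", "cat", "dog", "dog"]
values = [10.0, 12.0, 50.0, 54.0]

predicted_class = knn_predict(points, classes, (1.2, 1.4), k=3)
predicted_value = knn_predict(points, values, (5.5, 5.0), k=2, task="regression")
print(predicted_class, predicted_value)
```

With this toy data, the three nearest neighbors of (1.2, 1.4) are two "cat" examples and one "dog" example, so the vote goes to "cat"; the two nearest neighbors of (5.5, 5.0) have values 50.0 and 54.0, so the regression estimate is their average.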
Info
Channel: Tina Huang
Views: 279,484
Rating: 4.9701891 out of 5
Keywords: data science 2021, learn data science in 2021, how to learn data science in 2021, 66daysofdata, data science, ken jee, machine learning, data scientist, minimize effort maximize outcome, how to learn data science smartly, how to learn data science, learn data science, learn data science for beginners, data science for beginners, introduction to data science, data science course, what is data science, best data science course, data science projects, data science courses online
Id: Axu4tJl8gbM
Length: 12min 16sec (736 seconds)
Published: Sat Dec 26 2020