How I'd Learn Data Science In 2024 (If I Could Restart) - The Ultimate Roadmap

Video Statistics and Information

Captions
By the end of this video, you will have the blueprint to become a data scientist in 2024. Videos on this topic are usually optimized to be as digestible as possible for YouTube, but a 12 minute 25 second video will only allow you to walk away with a few bullet points and a lot of confusion. Today I will be giving you so much more than that: the ultimate road map. I'll not only tell you the most effective self-taught route to take, but also give you different projects you can do, and in what order, at different stages of your journey, the resources that I use to learn data, and most importantly a bunch of different ways to stand out in a sea of data scientists. Strap yourself in.

Thousands of people get into this journey. They spend countless hours hunched in front of their laptop, away from friends and family, learning all these different programming languages, just to get to the end and say, "Yeah, I studied data science, but I realized it's not really for me." That's what happens when you don't answer these three simple questions before you start studying data science: Do you know what data science is? Do you know the skill set you need to become a data scientist? And does it sound like something that you want to do?

So how do you get the knowledge to answer these questions? You need to do three different things. Research what data science is online; just an hour or two is enough to give you the gist. Look up a couple of industries that you're interested in and see how they use data science. And the third thing you need to do is look at what a typical day in the life of a data scientist looks like. Research the typical work days: are you happy with the amount of coding they have to do, the amount of stakeholder interaction they have, and the amount of presenting they have to do, for example? Give yourself a solid foundation before diving in head first.

Step two: dive in head first. The two most important skills we need as data scientists are maths and programming, so how do we pick which to learn first? To me
the answer is simple: programming, definitely programming. That's because even if you're not great at maths, you at least have a level of familiarity with it from your high school days, but programming can feel like entering a whole new world. Programming is also how we'll often be implementing the math that we learn, so it makes sense to have this solid foundation of programming before you begin.

The major programming languages we have as data scientists are R and Python. Don't waste time on this; I'll make it super simple for you: Python. Just pick Python. Your end goal here is to be as employable as possible, so just pick Python. Why? Well, when I look at job postings, it's like this: 60% ask for Python explicitly, 30% ask for R or Python, and very few ask distinctly just for R with no alternative for another language. So if you choose R, you will be excellently positioned for those R-only jobs because there is a lot less competition, but the problem is that those jobs are much rarer, and you'll be hamstringing yourself for all of the jobs that ask for Python. So just pick Python, please.

The next step is to pick a simple, commonly used IDE. A simple analogy to understand what IDEs are: if tomorrow I decided I wanted to write a book in English, I'd have to decide what software I would write that book in: MS Word, Google Docs, Scrivener. That software is the equivalent of an IDE. So what IDE should you write your Python code inside of? Again, I'll make your life easy: pick something people use, or at least what I use: VS Code. Congratulations, you now have your basics in order.

Now, what do we have to learn in Python? Good news: I'm not just going to give you the basics in Python, but also great mini projects that you can implement to solidify your knowledge. These six basics are things that you use every day, and some of them in every single line of your coding journey. Without a good grasp of these, it is effectively impossible to code for data science. So the first thing I want you to learn is data types and what you can and can't do with
each one of these within Python. You probably intuitively know a few of these: ints are basically whole numbers, floats are decimals, and strings are words, numbers, and any combination in between.

The second basic that I need you to get a hold of is assigning variables, and again this is pretty straightforward: assigning a variable is basically giving your object a code name that you will refer to it by throughout the rest of your code.

After that, learn about lists. Lists really are super simple to understand: they are effectively ways of storing different items together within Python, so instead of always referring to items individually, you can use a collective name. A simple, easy-to-understand use case: if you had four variables which were country names, England, Wales, Scotland, and Northern Ireland, instead of always writing them out one by one, you could just put them into a list called the UK.

Next up after that is dictionaries and how they work. They really are sort of like lists, except the items are stored in pairs, so if you wanted to include more information, you could do that. Instead of just storing the countries in the UK, you could also store each country and its capital city. Okay, I'm going to stop giving examples because it might start to get confusing, but trust me, it's super straightforward. It might just be because this is your first exposure to these concepts.

Now we're on to the basics of pandas DataFrames. If we want to be oversimplified, a pandas DataFrame is essentially a fancy way of saying a table. A DataFrame is a glorified table. You understand how tables work in Excel? Okay, good; you're at least on your way to understanding how DataFrames work in Python.

Are you feeling confused at this point? If so, it's absolutely fine. These might be new concepts to you, and I promise you, as soon as you start learning and getting the hang of these concepts, you'll be like: I
remember when I used to struggle with what a list was. Trust me, progress is inevitable.

Now I need you to learn just three more things and we'll be done with the basics, and you'll see just how much we can do with just the basics when we get to the projects. The fourth basic I need you to learn is basic control flow: if/else statements, for loops, and while loops. All of these are basically what they sound like. It might take a little bit of grasping, but you'll be fine.

After this, I want you to learn a basic visualization library such as Matplotlib or Seaborn, but personally I prefer Plotly. It is just a little bit nicer looking, and it lets you make simple charts to visualize your data as you go along.

And the last absolute basic that I want you to learn is functions, and how to define and create them. Functions are basically predefined bits of code that you can call at any time to avoid writing the same code again and again. So let's say, for whatever reason, in your code you're going to need to divide numbers by two, then multiply by five, then subtract three. You could do that manually every single time, which is super time consuming, not to mention boring, or you could write a function where you say, "Hey, every time I give you a number, I want you to divide it by two, multiply by five, and then subtract three," and then all you have to do is call that function and it does it for you. Much simpler. Okay, I promised I'm going to stop giving examples because it could get a little bit confusing.

Anyway, when you reach this stage, give yourself a pat on the back, because you've come through some of the toughest parts of learning how to code. But all of this knowledge that you've just gained is absolutely useless unless you apply this one principle. From now on, every time you pick up two to three skills, I want you to implement this principle, and that principle is project-based learning. Without applying these skills to a project that represents
real life, you will instantly forget exactly what you've learned, and you'll think, "Why would anybody care about lists or dictionaries?" That mindset will change when you apply them to a project. So these are three beginner-friendly projects to solidify the skills that you're already beginning to build.

The first one is to create a simple contact book application. In other words, within VS Code, I want you to create functions that will allow you to create a contact, add that contact to a contact list, find the details of a contact, update a contact's details, and also delete a contact if you need to do so. The only skills that you'll need are the ones that we've learned so far.

The second project builds on what we've just learned: it's an inventory management system. I want you to be able to create an item with a price and find the details of that item, but I also want you to have a till balance that updates every time somebody makes a sale or a return.

And the last project is the simplest of the three: I want you to write a function that takes in an Excel file, converts it to a DataFrame, and then returns the basic descriptive statistics of that file. All of these projects we can do with just the previously discussed skills.

Now you know what to learn and how to apply it, but where do you actually learn it? If you are going self-taught, there are two options, each with its own pros and cons. The first one might actually be the true self-taught route, which would be looking each one of these up on YouTube and learning that way. It's super cheap and on demand, but there are a lot of drawbacks. It lacks structure, and when you learn a new concept from one creator, they don't know what concepts you already knew before that, so they might refer to knowledge that you do not have yet. That's why, honestly, I think it's worth just shelling out the money to get a course or a boot camp, especially considering how much you will make once you become a data scientist. The major advantage of this for
me is the structure. It will give you a lot of structure, and importantly you can still get all the information that you would have gotten with the YouTube method, and in that way you can supplement your knowledge from the course. It also has disadvantages. It's not free, but still, let's be honest, way cheaper than a degree, and trust me, I would know. And it's also gamified; that was a big problem that I had with my course. What I used when I was learning was DataCamp, as it has skill tracks in Python, SQL, and anything else that you can think of. I do have my gripes with DataCamp, but I still use it to this day, so I will leave a link for it in the description for you to be able to check out the courses that they offer. I'll be making a video in the next few weeks teaching you how to effectively learn using an online course, so subscribe so that you don't miss that.

Everything that I'm saying might seem like a lot, but if you subscribe to my newsletter, you'll get a written road map of everything that I have shown you. More important than just a written road map are the insights that I will be sending you one to two times a month: insights that you can only get from working in the industry, the things that aren't great for the YouTube algorithm but will help you not just land a data science job but improve as a data scientist. Head to datanash.co.uk, pop your email into the box, and you'll get a road map and a free subscription to my newsletter.

Okay, so now that we have our base in programming, we can introduce some concurrency into our learning. That means learning two things side by side. We'll be doing more complex work in Python that will be tailored to making you more employable, and I'll touch on that in depth a little bit later, because the other thing that we're learning for now is maths. Now, I don't want you to download the entire contents of a math book into your mind, so we want the best bang for the buck: the fundamental mathematical concepts that are asked for by all jobs.

That math includes a basic understanding of some statistical concepts like the mean, median, and mode, data standardization, variance and standard deviation, kurtosis, skewness, correlation, and covariance, and you can read the rest on the screen. After that, we also have these important linear algebra topics. Linear algebra provides the framework for many data science operations and algorithms, but the key concepts you should focus on are systems of linear equations, vectors, matrices, eigenvalues and eigenvectors, normalization, and distance calculations. After that is some basic calculus and trigonometry, and you can see the four major areas I want you to focus on: differentiation, integration, limits, and trigonometric functions. And finally, this is an important one: we need to understand probability, and the concepts in particular I want you to familiarize yourself with are hypothesis testing, Bayesian probability, conditional probability, probability distributions, and expected values.

Take a breath. It's just maths. It's going to be okay. Unlike high school, where you had to write out pages and pages of maths by hand and get a nice thick red X every time you were wrong, the goal here is mainly to understand the underlying mathematical concepts so that you can interpret them and use them to aid your decision making. Most of the time there will be Python libraries that
can do the implementation on your behalf, and your job will mainly be structuring the code around it and interpreting those results. And remember, you don't have to be a master of these concepts; just have a good grasp of the fundamentals and you will be fine. One step at a time, remember.

So what can you use to help you learn the math? Well, here are three excellent channels that I recommend in this respect: StatQuest, redlick mats, and 3Blue1Brown. And let's not forget my favorite resource, which is random Google articles. Those are pretty good at teaching you some maths.

Now, don't forget our number one principle: project-based learning. We will now combine our math skills and our programming skills to solidify our knowledge. The first project: can you code a function that will calculate the moving average of a series of numbers and plot the output in a graph in Matplotlib? After that, can you code a basic statistical function that takes in a list of numbers and calculates the mean, median, mode, variance, and standard deviation? For variance and standard deviation, I want you to implement them manually, just to make sure that you have a good understanding of those concepts. The third thing: code a function that calculates the dot product of two vectors. For this one, again, I don't want you to use any libraries at all. No NumPy. For the next one, can you code a function that takes in two matrices, firstly checks if multiplication of those matrices is possible, then, if it is possible, multiplies them, and otherwise returns an informative error? Besides that, I want you to explore what you can do with these two libraries in particular, NumPy and SciPy. And of course, let's not forget my favorite one, the fourth one: random Google articles, which have saved me on more than one occasion.

We have now reached the whole goal of our data science journey, and this is one of my more controversial takes, but armed with just the basics of maths and programming, you should start applying for
jobs, but maybe not in the way that you expect. I want you to put together a CV that shows off your data skills and the projects that we've done so far. But here's the key part: we aren't spending four hours a day applying for jobs at this stage. Instead, we're just gently prodding around the market and mainly applying for entry-level roles and internships that do not require full customization of the application, because right now we probably won't get the job.

"But Nash, what's the point of applying if we won't get the job?" Good question. Two simple reasons, and the second one might be more important than the first.

The academic year of my master's was a 10-month process, but I secured my first internship to work as a data scientist in January of 2022, four months into my master's. When learning data science, we often view the process like this: I'll start off having no skill as a data scientist and nobody will want to hire me, and then eventually I'll get through all the courses and get that one final skill, and all of a sudden people will be dying to hire me. The reality is that the skill acquisition chart looks more like this, where as you gain skills, your level of employability rises. You do not know where on this axis your first employer will be willing to take you on as a data scientist, so by prodding around with your CV, you give yourself the chance to get your first job as early in the process as possible.

The second reason is all about downloading data. The last thing I want for you is to spend months picking up all of these skills, then apply for hundreds of jobs and never hear anything back, and trust me, this will happen if your CV is awful. By doing a bunch of simple applications as you learn, you get to test your CV. You might put in 20 to 30 easy applications and hear nothing back, and that is a sign to tweak your CV and see if your response rate improves. Maybe your education section needs to go below your work experience, or you need to change the length of your CV. Consistent easy applications will allow you to
experiment until you have reached your optimal CV.

With the basics of Python mastered, we can now have a lot more fun. In this part of your journey, I want you to have a lot more autonomy around which areas you dig into, firstly by having a curiosity mindset. In addition to this curiosity mindset, I also want you to adopt just-in-time learning. So as opposed to thinking, "Hmm, I don't know anything about web scraping, let me just randomly learn how to web scrape," the more effective way is to always be working on projects, and then when there is a project that demands that you learn how to scrape, you put aside time to learn it to move your project forward, as opposed to learning things just in case.

But before digging into the potential areas of specialization, I will make mention of these three areas that I advise you to have a solid understanding of. Well, four areas. The first is basic data preprocessing and feature engineering, but I also want you to know what supervised learning is and the basics of how to do it, and the same with semi-supervised learning and unsupervised learning.

After this come the common areas in which to specialize or get deeper knowledge: natural language processing, anomaly detection, predictive modeling, recommendation algorithms, marketing mix modeling, computer vision, and general machine learning are all good areas in which to get deeper knowledge. This doesn't mean you have to specialize in only one of these at this stage, but these are the areas that you commonly see job postings for.

Now, with all of that knowledge, you should be in a good position to actually get a job, and the key is to have excellent projects that appeal to employers, regardless of which specialization area you're looking at. It can be difficult to think of new projects that appeal to employers, and a platform that I recently discovered that helps me think of great projects is called ProjectPro. They literally have industry-leading standard projects, and in my opinion
are more advanced than what you can find online in general, so whenever I need a new project, that's the platform I go to, and I will put a link in the description for it. There are a lot of advantages, as you can see listed, but the one thing is that their subscription price does reflect the standard of the projects that it contains, so only do this if you have the funds to afford it and want those really, really top projects to stand out.

Quick interjection, people: I've actually spoken to the people at ProjectPro, and they've agreed to give you a 5% discount if you use the link below, just to make it a little bit more affordable. So I think: first do free projects, the ones mentioned in this road map, and see what you can do on your own time. After that, when you really want to take your projects to the next level, if that's what's holding you back, ProjectPro is a great place. I'm using it mainly for learning how to implement LLM solutions, but they have so much more than that. So yeah, link below, 5% off, if that sounds interesting to you.

If you can't afford that, there are plenty of cheaper and free options. The first one is taking on the projects that are within your course. This has a lot of advantages but also a lot of disadvantages, such as being quite generic, being focused on skill display rather than on being employable, and oftentimes spoon-feeding you to get you through the project. But of the free options, what I would recommend is using your own internal knowledge and curiosity. For example, maybe you have a background in customer service and decide, "I think it would be useful to write some code that would tell you the sentiment of customer reviews about our product, frequent questions that come up from customers, and the frequently mentioned reasons for poor reviews." Now imagine that you are an employer who does e-commerce and you see that project from this person. Your mind will instantly think: oh wow, they could bring so much to my company if they can translate that to us, because we want to
know what our customers are thinking. I have a whole video here explaining how to do effective projects to actually get employed, so you can open that in a new tab or add it to your Watch Later. And to do any project you will need data, so familiarize yourself with kaggle.com, which is a website where you can get free data to do your personal projects with.

Okay, so now we have good Python, good projects, and good fundamentals, but so do a lot of data scientists. We now want to be hyper-valuable and pick up additional skills that will put us head and shoulders above the competition. The first of these is SQL, which is a great language for querying and creating databases. It's excellent, and it is mainly used by data engineers and data analysts, but it's still very valuable for us as well. Compared to learning the basics of Python, learning the basics of SQL is pretty straightforward, but a few key areas I want you to focus on are how to query, how to create a database, including reducing it to 2NF and 3NF, working with relational tables and foreign keys, as well as elements like creating temporary tables and some easy window functions and partitioning. Again, what I used to learn all of this when I was going through my self-taught phase was DataCamp and their SQL Developer track, which was actually really good.

And once again, we've picked up a new skill, so what do we do? Project-based learning. Employers don't care about you telling them you can do SQL; they want to see that you can do SQL. So for these projects, we can have a dedicated SQL project, which is just you showing how to create a database. Nothing wrong with it, perfectly fine. But option B, I think, is integrating it with your existing data science projects. So where before we were building a predictive model on a Kaggle data set, now firstly create a database for that data set, reduce it to 3NF, then do the necessary joins to get the columns you need, and build your prediction model on top of that. I'm linking this free Kaggle data set down below so that
you can do that if you wish to.

Now, the next secret weapon. As data scientists, we often do not pay enough attention to the front end, the customer-facing aspect of our projects. We just concentrate on getting good at the coding and then leave it to the data analyst to make it look pretty. But a lot of companies can't afford to have a dedicated data analyst, so they're looking for a data scientist with the ability to present their findings and not just throw a random Jupyter notebook at them. So the next thing that you should do is become competent with visualization software, and I do recommend Tableau. When presenting your work to employers and recruiters, you'll now be able to show it off both as code and as a really appealing dashboard that crystallizes the work that you've done. The best part is that learning the basics of Tableau won't take you long at all, so there's no reason not to take a weekend or two just to learn the basics.

Now, with Tableau, Python, and SQL in your arsenal, and continued work on all three of these, you should be well positioned to get your first job. Where before we were being casual in our job search, we are now being really intentional with our job hunt. Really take the time to fix your CV now, and have dedicated time to apply for entry-level roles that you think you can get. It's not just Easy Apply anymore. Take the time to customize your CV where possible for different jobs, and list your experience and projects in a nicely ordered manner. I will be doing a completely separate video on this, but in the meantime, here's some information on how to increase your odds of getting that first job, as well as a couple more videos that I've done around this area that I will be linking down below as well.

Listen, you will feel stuck at different points during your data science journey, and if you do go down this path solo, it will get extremely lonely extremely quickly. So you need to find a community of other people who are getting into data science, or this area in general, for
moral support, but also to discuss problems and look to solve them together. You can look for communities online in the shape of forums provided by the courses you pay for, social media groups, and those sorts of things. I'll be honest, I don't have experience with either of those, but I do have experience with networking in person, which is an amazing resource that I've had great results with, and I do have a video that exclusively discusses how to network effectively. But the one thing that it does do is limit your community to those who are local to you, and that's a huge missed opportunity, which is why I'm looking to form a community to solve these problems that you can sign up for. In this community, I'll be having study sessions and regular calls with the members to provide more tailored advice and mentorship on accelerating your data science journey. It will be for dedicated fellow learners and experienced professionals who don't just want to be mediocre data scientists but want to work their way up to being truly great. It will provide an ecosystem for growth, support, and knowledge sharing, and it's just a space where you can ask questions, share insights, collaborate on projects, and get feedback, all of which are essential to accelerating your progress. If that sounds like something you want to be a part of, sign up for the waitlist below to get early access when this community does go live.

If we're thinking about the 80/20 rule, we're now definitely in the realm of nice-to-have things rather than necessities, but these definitely would make you one of the outstanding candidates. On top of everything else that we've already learned is learning how to work with APIs and use them to fetch data that can change dynamically, instead of the static CSVs we've been working with when downloading data off of Kaggle. Also learn the basics of GitHub, and these basics are getting your projects into your GitHub repository so that other people have access to your code as well. And something else
that I'm learning is Streamlit, which allows you to easily turn your code into an interactive web application that other people can use. More on that coming on the channel soon, but that's a super useful skill. And the last one, and this is very much an extra, is posting about your journey on platforms like LinkedIn and Twitter, which are particularly useful because they're more professional at times. If you have a digital footprint, it shows that you're slowly leveling up, and it could help you to stand out, but don't dedicate too much time to the documentation at this stage.

And now we are on to the final element, which is the cutting edge. Data science is a field that is always in flux, so you need to remain up to date with the latest trends, and these are the three best ways that I look to do this. Firstly, Medium and Towards Data Science. These platforms are treasure troves of articles, tutorials, insights, and so much more by data science professionals and enthusiasts. Whether you're looking for in-depth tutorials, case studies, or thought-provoking discussions on the latest AI or machine learning techniques, these are pretty good, although you do have to pay a couple of bucks a month. A free alternative to this is YouTube, on which of course I'm quite biased because I am on YouTube. I think there are a lot of smart data scientists on this platform who can give you so much information, so subscribing to a few channels is always a good idea. And the last thing is following experienced data science leaders on other platforms, again mainly Twitter and LinkedIn. Those are excellent resources to keep you on the cutting edge.

And there you have it: the best freaking road map on this platform, and yes, this year I'm talking my so don't forget to subscribe to the newsletter to get written resources for everything that I've talked about. At this stage, you might be feeling a little bit intimidated, wondering if you have what it takes to become a data scientist. I have this video over here that
addresses whether you are too dumb to be a data scientist, so click on screen now.
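
The Python basics described in the captions (a list of the UK countries, a dictionary pairing each country with its capital, a simple loop, and the function that divides by two, multiplies by five, and subtracts three) can be sketched roughly as follows. The names `uk`, `capitals`, and `transform` are illustrative assumptions and do not come from the video.

```python
# Illustrative names only; none of these identifiers come from the video.

# A list groups related items under one collective name.
uk = ["England", "Wales", "Scotland", "Northern Ireland"]

# A dictionary stores items in key-value pairs: country -> capital city.
capitals = {
    "England": "London",
    "Wales": "Cardiff",
    "Scotland": "Edinburgh",
    "Northern Ireland": "Belfast",
}


def transform(number):
    """Divide by two, multiply by five, then subtract three."""
    return number / 2 * 5 - 3


# Control flow: loop over the list and look each country up in the dictionary.
for country in uk:
    if country in capitals:
        print(f"{country}: {capitals[country]}")

# 10 / 2 = 5.0, then 5.0 * 5 = 25.0, then 25.0 - 3 = 22.0
print(transform(10))
```

Writing the arithmetic once inside `transform` is the point of the example: every later call reuses the same three steps instead of repeating them by hand.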
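
For the maths projects mentioned in the captions (a moving average, a manual variance and standard deviation, and a dot product without NumPy), one possible sketch is shown below. It assumes the population variance (dividing by n), since the video does not say which definition is intended, and it leaves out the Matplotlib plotting step. All function names are hypothetical.

```python
import math


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def variance(values):
    """Population variance (assumption): average squared deviation from the mean."""
    mu = mean(values)
    return sum((x - mu) ** 2 for x in values) / len(values)


def std_dev(values):
    """Standard deviation: square root of the variance above."""
    return math.sqrt(variance(values))


def dot_product(u, v):
    """Sum of element-wise products of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError(f"vectors differ in length: {len(u)} vs {len(v)}")
    return sum(a * b for a, b in zip(u, v))


def moving_average(values, window):
    """Mean of each run of `window` consecutive values."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]
```

If the sample variance is wanted instead, the divisor in `variance` would become `len(values) - 1`. The list returned by `moving_average` is what would then be passed to a plotting library.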
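
The matrix project from the captions (check whether two matrices can be multiplied, multiply them if so, otherwise return an informative error) might look something like this. It is a minimal sketch that assumes each matrix is a rectangular list of rows and that raising a `ValueError` counts as the "informative error"; the function name `multiply_matrices` is hypothetical.

```python
def multiply_matrices(a, b):
    """Multiply matrix a (m x n) by matrix b (n x p), both given as lists of rows.

    Assumes every row within a matrix has the same length.
    """
    if not a or not b or not a[0] or not b[0]:
        raise ValueError("both matrices must be non-empty")

    cols_a = len(a[0])
    rows_b = len(b)
    # Multiplication is only defined when columns of a match rows of b.
    if cols_a != rows_b:
        raise ValueError(
            f"cannot multiply: left matrix has {cols_a} columns "
            f"but right matrix has {rows_b} rows"
        )

    cols_b = len(b[0])
    # Entry (i, j) is the dot product of row i of a and column j of b.
    return [
        [sum(a[i][k] * b[k][j] for k in range(cols_a)) for j in range(cols_b)]
        for i in range(len(a))
    ]
```

Once this works by hand, comparing its output with NumPy's `@` operator on the same inputs is a reasonable way to check it, in line with the suggestion to explore NumPy afterwards.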
Info
Channel: Data Nash
Views: 23,967
Keywords: data science, data analytics, data science job, data engineering, tina huang, study md, ali abdaal, ken jee, how i would relearn data science in 2023, how to become a data scientist fast, 2023 data science roadmap, how to learn data science, How to Become a Data Analyst in 2023?, sundas khalid, data analyst road map 2023, how i would learn data science in 2024, how to learn data science in 2024 (if i had to start over)
Id: 6DxBaphvap4
Length: 26min 42sec (1602 seconds)
Published: Thu Jan 04 2024