How to Become a Data Scientist in 2024? (complete roadmap)

Video Statistics and Information

Captions
Over the last 2 years, more than 400,000 people have been laid off, and data scientist roles were affected. So why should you become a data scientist in 2024? This is what somebody would tell you to discourage you from becoming a data scientist. Yes, data scientist roles were affected by the layoffs, but when you zoom out and look at the bigger picture, you will see that less than 10% of the roles that were laid off belonged to data scientists; 28% belonged to HR and recruiting, and 22% to software engineering. Yes, there were more software engineers laid off than data scientists. Yes, the job market is still soft and it will take some time to catch up, but this also makes it a good time to start learning data science. In this video I'm going to be sharing a complete roadmap that you can use to learn data science and land a job.

So what does a data scientist do? A data scientist takes the data and turns it into powerful information that helps businesses succeed. Now, this is a very simple definition, and after watching this video you will actually have a really good understanding of what this definition means, and if somebody were to come to you and ask what a data scientist does, you'll be able to explain it. A typical data scientist project involves a few steps. It typically starts with data collection, then data processing, exploratory data analysis (EDA for short), feature engineering, model development, model evaluation, model deployment, and iteration.

So before we jump into the roadmap, there's one exercise that I want all of you to do. I want you to research the data science job family and look at all the roles that are under the data science umbrella. There are typically a few roles that stand out under the data science job family: data scientist, data analyst, machine learning engineer, AI specialist, data engineer, analytics manager, research scientist, and many more. One of the mistakes that I made when I started learning data science is that I only focused on
the data scientist role, and back then I didn't know that so many other job roles exist in the data science space; I could have considered them if I had known about them. So what I would like you to do is become familiar with all the roles that are available and figure out what you would enjoy doing and what your target looks like before you actually jump into the data scientist role itself, because there is a lot of learning involved and it will take time. You don't want to waste your time; you want to make sure you are fully dedicated to learning data science before you commit to it, because it will take a lot of hard work.

Let's say you have figured out that you want to become a data scientist generalist. The next thing I want you to do is narrow down the companies and the roles that you would be interested in. Then I want you to go to their career website. Let's say you want to work at Meta as a data scientist: I want you to go to Meta's career website and look at the data scientist role, look at the job description, and understand what type of candidate they are looking for, what skill set they are asking for, and what hard skills and soft skills they require. Is it a lot of experimentation? Is it a lot of machine learning? That will give you a better idea of what you need to learn, and it will help you define your roadmap and customize it. Take notes on all of this. Then I want you to go on LinkedIn and search for people who are already working as data scientists at Meta. Look at their profiles, look at their education history, look at the type of work that they're doing, and take notes again, because we're going to be using this throughout the roadmap.

So let's say you have narrowed down that the data scientist role is what you want to pursue, you have done your research, and you're ready to go. A lot of data scientist roadmaps tell you to start learning Python as the first thing, but I completely
disagree with this advice, and here's why. Coding, including Python and SQL, is a tool to apply data science; it is not data science by itself. I'm going to say this again: coding is a way to apply data science, it's not data science by itself. What data science actually is, is statistics and machine learning, so that's the first skill I want you to learn. If you start learning statistics and machine learning and you realize this is not for you, then I would want you to stop and not pursue it further, because in order to be successful as a data scientist you need to have the basic fundamentals. Coding is a tool to apply statistics and machine learning, so what I want you to do is start building your foundation with statistics and machine learning.

For statistics, I would recommend that you learn descriptive and inferential statistics. Descriptive statistics is basically understanding what the data looks like, and there are several methods that you need to understand in order to learn descriptive statistics. It includes measures of central tendency (basically mean, median, mode), measures of variability such as standard deviation and variance, frequency distributions, measures of shape (how skewed the data is), and graphical summaries, basically bar plots and so on. Next I want you to learn inferential statistics, and this is where it starts becoming fun. Learn probability distributions, hypothesis testing (one of my favorite subjects), Bayesian versus frequentist statistics, sampling, time series analysis, experiment design, multivariate analysis, ANOVA, survival analysis, and bootstrap resampling. These are some of the concepts that you need to know in order to have your basics done in statistics.

Next we're going to focus on machine learning. For machine learning, divide the subjects into two focus areas: one is supervised machine learning and the other is unsupervised machine learning. First we're going to
focus on supervised learning. For supervised learning, we're going to focus on linear regression and logistic regression. Make sure you know these models in and out, because these two models coming up in an interview is very, very common. Continuing with supervised learning, there are decision trees, random forests, neural networks, and deep learning. Now again, if you have done the research and looked at the role, you will understand how much machine learning is required, so you might not need to go as in depth into these machine learning models, depending on the job family or the company and the role that you're targeting. But if you want to do a lot of machine learning, then you definitely should learn all of these things. Under unsupervised learning, focus on k-means clustering, hierarchical clustering, and principal component analysis. This one specifically, and k-means clustering, I've seen come up quite a bit in interviews, so understand these concepts well.

There are several online resources that you can use to learn statistics and machine learning. For statistics, I specifically love using Khan Academy as well as a YouTube channel called StatQuest, where you can watch videos and learn statistics. There are several books that you can read, and you can also use ChatGPT and Google where you don't understand something. One of my favorite things with ChatGPT is that I ask it to explain complex subjects as if it were explaining them to a child, and it does a really, really good job. Some of these subjects can be difficult to understand, so use all the resources at hand to get a solid understanding of statistics and machine learning.

So let's say you have figured out your fundamentals. Now it's time to apply these skills. Remember, I said coding is a tool to apply statistics and machine learning, so now that you know statistics and machine learning, you're going to apply them, and this is where you need to learn coding so you can
apply the skills that you have learned. For coding, hands down, learn Python and SQL. When I started learning data science I started with R. I worked with R for 2 years and then I switched to Python. One of the things my team realized (well, not early on, but after two years of doing this) was that it was really hard to communicate with the engineering team: taking our R code and then deploying it was very difficult, and it increased our project lifespan. Anyway, I have covered this topic in a different video; you can watch that video to see why I prefer Python over R. For Python, you should understand how to do data analysis in Python and know libraries such as pandas, NumPy, and Matplotlib. For machine learning, you should know Python libraries such as scikit-learn, TensorFlow, and PyTorch, depending on the model that you're using. There are so many videos and resources available online where you can learn Python.

The next coding language that you should learn is SQL. You will be using SQL quite a bit when you are accessing the data, because most of the time the data is sitting in a database. So understand how to write SQL and know how to join data sets. You should know all the joins (left, right, outer, inner), the SELECT statement, how to filter your data, how to use aggregate functions, how to write subqueries, and how to use advanced analytics functions.

So by now you have statistics and machine learning covered, and you have coding: you know Python and SQL. The next thing you're going to focus on is learning the tools. For tools, what I would recommend is that you understand how to work with notebooks such as Jupyter Notebook and Google Colab notebooks, because chances are that you're going to be using those for your data science work, writing your code there and doing most of your data science projects there. Understand how the code review process works, because chances are that at your job you will be following the code review process,
so learn all the Git commands and understand fully how Git works: how you can push the code, how you can pull the code, how you can deploy the code, and things like that. Obviously these tools will vary depending on the company that you end up at, so I'm not trying to focus too much on this right now, because you will eventually end up learning more tools as you start doing your work as a data scientist.

Let's say you don't want to become a data scientist and you want to become a data analyst. In that case, I wanted to share this data analytics program by CareerFoundry, who is also sponsoring this section of the video. The program has elements of generative AI, so it not only teaches you data analytics but also teaches you how to use generative AI in your day-to-day work to be more productive and work smarter. The learning material is organized in three segments. First you get an intro to data analytics and cover various topics, then data immersion, which is hands-on experience with all the topics that you have learned so far. Again, a great way to learn anything new is to learn something and then apply it; this is what I really like about this program. Lastly, you get to pick a specialization: either data visualization with Python or the machine learning specialization. This is amazing because the specialization allows you to go deeper into a specific data analytics topic. You'll get access to expert mentors and tutors and get to graduate with a professional portfolio in data analytics, and most importantly there's a job guarantee that ensures that you land a job within 6 months of graduation or they will refund your tuition. Use the link below to learn more and get 20% off. Now let's go back to the video.

So by now you have your statistics and machine learning fundamentals done, you know coding, and you know the tools. Now I want you to spend some time learning the business fundamentals and product fundamentals. This step is
specifically important because what differentiates a good data scientist from a great data scientist is having a solid business understanding and a solid product understanding, so you can build solutions that help the bottom line of the business and the product. Understand how product thinks, how the product development life cycle works, and how you will plug in as a data scientist into these product and business areas to help them grow and succeed, because having a good business and product understanding will make you a superstar data scientist.

And on the topic of superstars, you also need to have solid communication skills, because chances are that at the start and at the end of your project you will be communicating with business, with engineering, with product, and with many other job families. So for you to be a great data scientist, you need to have really good communication skills, where you can communicate to a tech audience and a non-tech audience. Get really good practice, whether that is in presentation form or in written form. Having good communication skills will help you grow in your career way further than somebody who does not have these communication skills, and we're going to practice this when we build the project portfolio.

So let's say you have the basics done: you know statistics and machine learning, you can solve problems by applying these fundamentals using the tools that we mentioned, you have good communication skills, and you understand how product and business think. Now I want you to apply these skills by doing projects. This is going to help you in two ways: first, it's going to help solidify your knowledge, and second, it's going to help you stand out on your resume when recruiters are looking at it. Five is typically a good number of projects to focus on. You can do more or
less; just make sure that they represent the skills that you know. Remember, we took notes at the start of the video, where you looked at the job description and at the people who are already working there. This will give you a good idea of what type of work the company is looking for and what type of work these people have done to help them get a job at your target company, which in this case is Meta. So now you'll figure out what kind of projects you're going to do. For example, one of the things that stood out to you was probably a lot of experimentation and A/B testing, so one of the projects that you will do here is around hypothesis testing, including experiment design, doing the analysis, and then presenting insights and recommendations from that hypothesis testing. For the second project, let's say you can focus on time series forecasting. The third project could be customer fraud detection; this is a classification problem. For the fourth project, let's say you build a recommender system, and on the fourth project you can also do some EDA (exploratory data analysis) and some feature engineering. So build your project portfolio in a way that shows a unique skill set in each project, so a recruiter reading your resume can tell that you know all of these things. If you're looking for data sets, there are so many free data sets available on websites such as Kaggle and Google Dataset Search; you can find free data sets there and start doing your projects.

Now, we're not done yet. In order to land the job, let's say you've done the projects and they've helped you get the calls for the interviews. Getting the calls for the interviews is not enough; you actually need to sit down in the interview and pass it in order to get the job. This is why I want you to do the interview prep. Responding in an interview setting, in front of a stranger, in a time-constrained setting can be nerve-wracking. I
have personally blanked out in interviews on concepts that I knew very, very well, so this is why getting practice on these interview questions is very important. For the data scientist interview, I typically like to divide it into three segments: one is behavioral, the second is coding, and the third is fundamentals of statistics and machine learning. Under statistics and machine learning, do a lot of case-study-focused interview prep. For example, they can give you a problem in the interview; they can say, "We want to improve our customer churn rate. Tell me what exactly you will do." They obviously will not give you a problem as clear as the one I just gave; they will give you a much vaguer problem. They will expect you to define the problem and then solve it using a solution that you have to pick, and you will have to lay out the pros and cons and show why you actually picked that solution. In my opinion, a lot of the time people end up focusing on coding, which is fine, but you also need to have a solid understanding of how to solve these case studies for the specific data science concepts that are the core skill set of a data scientist role.

Notice that in this video I did not give you a timeline. I didn't say that you can learn this skill in 3 months or 6 months. Every person has unique needs and unique abilities. It might take you 6 months to do this, but for someone else it might take 5 months or 12 months. So take your time and learn these skills. If you found value in this video, give it a thumbs up, and let me know in the comments about your data science journey and whether you have any questions. Thanks for watching. I'll see you in the next video. Have a good one. Bye.
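As a small illustration of the descriptive statistics listed in the captions (mean, median, mode, variance, standard deviation, and skew), here is a minimal Python sketch using only the standard library. The `describe` helper, the sample numbers, and the choice of Pearson's second skewness coefficient as the shape measure are assumptions made for this example; none of them come from the video.

```python
import statistics


def describe(values):
    """Return basic descriptive statistics for a list of numbers.

    Hypothetical helper for illustration only.
    """
    mean = statistics.mean(values)
    median = statistics.median(values)
    mode = statistics.mode(values)           # most frequent value
    variance = statistics.variance(values)   # sample variance (n - 1 denominator)
    stdev = statistics.stdev(values)         # sample standard deviation
    # Pearson's second skewness coefficient: 3 * (mean - median) / stdev.
    # Positive means a longer right tail, negative a longer left tail.
    skew = 3 * (mean - median) / stdev if stdev else 0.0
    return {
        "mean": mean,
        "median": median,
        "mode": mode,
        "variance": variance,
        "stdev": stdev,
        "skew": skew,
    }


# Made-up sample data, not taken from the video.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
summary = describe(sample)

for name, value in summary.items():
    print(f"{name}: {value}")
```

In day-to-day work the same summary would more likely come from a pandas or NumPy call; the standard library version is used here only to keep the sketch dependency-free.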
Info
Channel: Sundas Khalid
Views: 143,429
Keywords: data science, data scientist, self-taugh data scientist, big tech, Machine Learning, Python, data science projects, data science tutorials, data science jobs, AI, big data, data analytics, business analyst, sundas khalid, data analyst 2024, data scientist 2024, how to learn data science in 2024, machine learning roadmap, chatgpt, bard, faang, data science roadmap 2024, data science interview, machine learning course, tech jobs 2024, machine learning for beginners, ai engineer
Id: mrOIT6v8_0g
Length: 14min 9sec (849 seconds)
Published: Sat Jan 20 2024