What Does a Data Scientist Actually Do?

Video Statistics and Information

Captions
Hello everyone, Ken here, back with another video for you. Today I'm talking about the types of work that you can expect to do as a data scientist and how much time you can expect to actually spend on these tasks. As usual, if you find this content interesting, please hit that like button, and if you want to see more content like this, please subscribe to my channel.

So the first activity that you're going to be doing as a data scientist is project planning, and I expect this to take from 0% to about 20% of your time on a day-to-day basis. Now, in some companies this might be reserved for more senior members, but I think in the most successful companies everyone, from the director of data science to a junior data scientist, would be involved in this process. This is thinking about the prioritization of each of the projects, and the grooming and gathering of the requirements for the actual projects as well. So you can expect to do a reasonable portion of this, but I don't expect it to take a large portion of your day. Most teams build this into grooming sessions or have set aside time to work on these activities.

The second type of work that you can expect to do as a data scientist gets a lot of talk surrounding it, and this is the data cleaning, data manipulation, data aggregation, and querying component of data science. It's all manipulating and working with the data, and you can really expect to be doing this from around 20% to 60% of the time. It is a large component of what you do. I will say that on different teams it can vary greatly. If you work in a large corporation that has a data engineering team, they might be building out tables in the database that you can just "select star" from, or you might actually be querying all the data yourself, and even scraping the data, bringing it in, and cleaning it. So there's going to be a sizable element of this day-to-day task in any data science role that you are involved in. I'm a big believer that this skill is really important to cultivate, and I even put together a couple of quick video series about how to clean, manipulate, and aggregate data. You can see those above.

The next component of data science is actual data analysis. This isn't algorithm building; this is looking at the characteristics of the data and finding trends. This is also creating cool visuals, for example in Tableau and Power BI, in R Shiny, or in Python with Matplotlib, Plotly Dash, etc. There's a lot of really cool stuff you can do here, and some data scientists might be doing this almost full time. I would expect you to be spending between 10% and 50% of your time on this type of activity. To me, the data analysis is really where data science starts to get fun. There's an element of creativity involved, and you also start to learn very interesting things that can be actionable. You don't have to build a cool algorithm to be able to create business impact. You can separate things into different groups and understand the trends of each group, and that can create real business value. Some people might look down on this and say, "Oh, you can do this in Excel." Well, that's absolutely true, but that doesn't make it any less useful.

The next component is actually building machine learning models and/or deep learning models, basically model building in general. This is after you've got the clean data. You're evaluating what features are relevant to the model and how you can tune the parameters to make the model as successful as possible, and you're also trying to understand the evaluation criteria as well as possible. You're looking at things like accuracy, precision, and recall, and those are all very relevant to the type of problem that you're trying to solve, so thinking about a problem across all those different dimensions is really important. I don't expect this to take you a tremendous amount of time. There are great ways to automate parameter tuning and great ways to automate some of these other things, so I think most data scientists spend around 10% to 40% of their time focusing on this area.

The last component is actually the implementation of the models. This might be looked at by a lot of people as an engineering task, but I think the onus is a lot more on the data scientist or a machine learning engineer to actually implement the solution that they create. So you have this model; now what do you do with it? You have to make it into an API endpoint for another product to hit, for your webpage to hit, or for it to be useful to someone. Not all data science comes out with an answer or solution. A lot of the time there is a production model that is being used on a day-to-day basis, with things running live in real time. That's an important skill to cultivate, and it's also something that requires a lot of work and a lot of engineering. So if you're a data scientist, this could take up anywhere from 0% of your time, where you're just doing the previous steps, to almost 60% of your time, where the main focus of your job is taking models that either you or others create, putting them into production, and making sure that they are creating value for the business.

As you can tell, all of these things are part of the data science lifecycle: the planning, data aggregation, descriptive statistics, actual model building, and productionization. That's how you should be framing what your work is like. There are people who do most of their work on the front end of that cycle, and there are people who do most of their work on the latter half of that cycle. When you're interviewing for a job or looking at a new position, that's something you should consider: where are you fitting into this cycle, and is that where you want to be in the cycle? For some people the data engineering component is very interesting, and for some people the implementation component is really interesting. So that's something you should think about in terms of your own goals and your own likes and dislikes.

As usual, thank you so much for watching this video, and good luck on your data science journey.
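
As a rough illustration of the evaluation criteria mentioned in the model-building part of the video (accuracy, precision, recall), here is a minimal sketch in plain Python. It is not from the video: the labels, the `ConfusionCounts` class, and the helper names are all made up for illustration, and it assumes a binary problem with labels 0 and 1. It shows why looking at more than one metric matters when the positive class is rare.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Counts of the four outcomes for a binary classifier (hypothetical helper)."""
    tp: int = 0  # predicted 1, actually 1
    fp: int = 0  # predicted 1, actually 0
    tn: int = 0  # predicted 0, actually 0
    fn: int = 0  # predicted 0, actually 1


def count_outcomes(y_true, y_pred):
    """Tally outcomes, assuming every label is either 0 or 1."""
    counts = ConfusionCounts()
    for actual, predicted in zip(y_true, y_pred):
        if predicted == 1 and actual == 1:
            counts.tp += 1
        elif predicted == 1 and actual == 0:
            counts.fp += 1
        elif predicted == 0 and actual == 0:
            counts.tn += 1
        else:
            counts.fn += 1
    return counts


def accuracy(c):
    """Share of all predictions that were correct."""
    total = c.tp + c.fp + c.tn + c.fn
    return (c.tp + c.tn) / total if total else 0.0


def precision(c):
    """Share of predicted positives that were truly positive."""
    flagged = c.tp + c.fp
    return c.tp / flagged if flagged else 0.0


def recall(c):
    """Share of true positives that the model actually found."""
    positives = c.tp + c.fn
    return c.tp / positives if positives else 0.0


# Made-up, imbalanced example: only 2 of 10 cases are positive,
# and the model finds just one of them.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

counts = count_outcomes(y_true, y_pred)
print("accuracy :", accuracy(counts))
print("precision:", precision(counts))
print("recall   :", recall(counts))
```

On this toy data the accuracy looks strong while the recall shows that half of the positive cases were missed, which is the kind of trade-off that depends on the problem being solved. In practice a library such as scikit-learn provides these metrics; the hand-rolled versions here are only meant to make the definitions visible.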
Info
Channel: Ken Jee
Views: 21,507
Rating: 4.9648681 out of 5
Keywords: data science fundamentals, data science tutorial, data science class, data science learning, data science course, data science, ken jee, big data, machine learning, python, pandas, data science python, data science for beginners, tutorial, data science student, data science tips, data exploration, joma tech, siraj raval, python programmer, tech lead, Ken Jee, Data Science Work, Data Science Day in the life, What does a data scientist do, what do data scientists do
Id: XWetgrNas-k
Length: 6min 27sec (387 seconds)
Published: Thu Jul 18 2019