How To Learn Data Science Smartly?

Video Statistics and Information

Captions
Hello all, my name is Krish Naik, and welcome to my YouTube channel. Today we are going to understand how to learn data science. Many of my subscribers are asking what the different steps to learn data science are and how they should proceed. So I am going to explain it with this big diagram. Don't get worried when you see it; it has a lot of detail in it. What I am going to explain is the process that I applied for my own transition towards data science.

To begin with, in the center I have data science. In data science you have to know various things, starting with at least one programming language. I would rank Python first, then R, and then Java, but I would prefer Python or R because they have a lot of libraries, and with the help of those libraries you can implement various machine learning algorithms.

Then consider machine learning. In machine learning there are various techniques, like supervised, unsupervised, and reinforcement learning, and many more. Within that you will have problems like classification, regression, and reinforcement learning. As you know, deep learning is a subset of machine learning, so I have mentioned deep learning and dimensionality reduction here. Apart from that we also have clustering algorithms, which I would also like to specify. Most of your problem statements revolve around these kinds of scenarios, and you will be using machine learning for them. Deep learning is the subset of machine learning where you implement those with the help of neural networks.

After that you need to know some tools, like an IDE. An IDE is an integrated development environment, that is, which editor you are using
for coding in Python and R. There are many. For Python, I have taken as examples PyCharm, which is very nice, Jupyter, and Spyder. Apart from that, for R you have RStudio, and for Python you also have Visual Studio, which is also a very nice tool where you can write and debug your code and do a lot of other things. So you need to know one of these IDEs. If you are going ahead with Python, make sure that you know one of these; if you are going ahead with R, know how to work with RStudio, because that is also a very good IDE.

Then you need to know web scraping. It is not that you have to know it extensively, but in some scenarios, when you need to collect data, you will require web scraping. For that there are libraries like Beautiful Soup, there is a tool like Scrapy, and you will also be using another library called urllib, which helps you read data directly from a URL, for example in the form of JSON. Apart from that, libraries used in machine learning, like pandas and NumPy, will also help you here. You need some basic knowledge of web scraping.

Apart from that, maths. As I told you in detail in the video I uploaded yesterday about the role of maths in data science, I would specifically focus on statistics, linear algebra, and differential calculus, because most of the algorithms are based on these concepts.

Then you also need to know data visualization. For data visualization I have written Tableau and Power BI; these are tools where you can do a lot of data visualization. Apart from that, in Python and in R you have
different libraries, like Matplotlib and seaborn, which will help you do a lot of visualization from the code that you are writing.

Then you go to the data analysis stage. This is also a very important step, where you do feature engineering, data wrangling, exploratory data analysis, and a lot more.

Now, from all these components that you have seen, you have to follow one more thing. Suppose you select a programming language. You should know all these algorithms; try to learn each and every algorithm and understand the math behind them. It is not that you have to solve it by hand or write it out somewhere; just understand how it is implemented, because the main thing is your data science use case. Based on the use case, you will be using different techniques.

Suppose I have a use case where I need to predict house prices for a particular city. What will I require? First of all, I will require data. I may get it through web scraping, or I may depend on some third-party APIs. After that I may do some data analysis on that data: feature engineering, data wrangling, exploratory data analysis. Then, when I am selecting the algorithm, I will use that data analysis along with some maths to select a particular algorithm, apply it with the help of the mathematical techniques that I have learnt, and then I will be able to implement it. Apart from that, before implementing the machine learning algorithm, we may use tools like Matplotlib, seaborn, Tableau, or Power BI to understand the data: how the data is distributed, whether it follows a normal distribution, whether it follows a standard normal distribution, whether I can convert it into a standard normal distribution, whether
I have outliers in that data, whether the data is imbalanced. There are a lot of things you can find out, and you can take more information out of that data. The main stage is this data analysis.

Now understand, guys: do one or two use cases keeping all these stages in mind. It is not that you have to study everything separately. Take a use case and start solving it. I know many of you know the Python or R programming language. All you have to do is take a particular use case. Suppose I am taking house price prediction. What am I going to do with that? First of all, I will require data. Currently I don't even have to collect it, because data is readily available for the first use case, since you are practicing and making a career transition towards data science. So take a use case, try to understand what that use case is all about, and then apply whichever of these techniques you require.

The first technique that you will apply is data analysis. The second is that after this data analysis you will visualize the data to understand more about it. The third is selecting a machine learning algorithm: seeing whether it is a classification problem or a regression problem, and which algorithm you are going to apply. Everything comes together here. And obviously you will be selecting one IDE, that is for sure: PyCharm, Jupyter, or Spyder, some of the best-known IDEs for Python programming or for implementing any machine learning algorithm. If you remember, a lot of cloud providers, like Amazon with AWS, and Azure, provide an integrated Jupyter IDE where you can code and deploy directly
into production. Again, the deployment part is completely different from this. For deployment you will have one more diagram, specific to deployment. In deployment you are going to use different tools, like AWS or Azure, or you may also use Spark. Sorry, it should not be Spark, because Spark comes under big data. So AWS or Azure. If I take the example of AWS, you may take an EC2 instance and deploy your model with Flask over there: you integrate your model with the Flask framework, upload it to AWS, and create an API so that it can be consumed by the front end.

So how did I study? How did I learn? First of all, I took a very basic use case. As I took the use case, I was reverse engineering each and every step of it, such as how the data analysis was done. Let me remind you that one of the use cases already done was house price prediction, and that was freely available. The first use case that I did was with the Iris data set. There we need to classify what kind of iris flower it is based on the sepal length and sepal width. For this data set the solution was clearly given on the internet. I explored it, did the reverse engineering, did the data analysis part, did the visualization part, and came to know a lot of things. And remember that initially, when I was learning Python, I was not perfect at using pandas, NumPy, and all. It is all through reverse engineering that I focused, tried to understand the subject much more properly, and got more and more knowledge. When you do reverse engineering, you too will be able to understand a lot of things.

So I did one example with the Iris data set. Then, whatever domain I was working in, with my business knowledge, I could apply the same approach to that as well. I could create a new use
case, so that I could solve it with the help of machine learning, deep learning, or data science in general. So I was able to create a data science project and carry it out. Even though you are working in some different domain, if you give an idea that a particular problem can be solved with the help of machine learning, that will be of great use to the company. People, even your managers, will appreciate that work, because you are trying to solve a problem through machine learning techniques.

So this is how you have to go. You have to do reverse engineering and try to follow this pattern. See the use case. It is not that every project can be turned into a data science project. Understand a use case, and based on that use case, ask whether you can solve the problem and what steps you are going to apply. This is the whole diagram that I have drawn in front of you; I have included everything. You have to bring some perfection through the reverse engineering that you do in various things.

Let me give you one more example. The first time I was handling categorical features, I only knew about one-hot encoding. Later on I got scenarios where I had many categorical features, and if I performed one-hot encoding it was unnecessarily creating so many columns. So I found a different way to handle that scenario. As I said, it is all reverse engineering: I was doing a use case, and that problem came up at that point. So I thought, how could I solve this? I did a lot of research and finally got a lot of input from the data science community on how to handle it. There were a lot of competitions; I looked at Kaggle competitions, checked out the kernels that were freely available, and was able to gain that knowledge.
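The captions do not say which alternative to plain one-hot encoding was used. As one common option, and purely as an assumption for illustration, the sketch below keeps only the k most frequent categories and folds everything else into a single shared column, so the column count stays at k + 1. The function name `one_hot_top_k`, the `"other"` label, and the sample city values are all hypothetical.

```python
from collections import Counter

def one_hot_top_k(values, k):
    """One-hot encode the k most frequent categories; all remaining
    categories share a single 'other' column.

    Assumes no real category is literally named "other".
    """
    counts = Counter(values)
    # most_common sorts by count; equal counts keep first-seen order
    top = [category for category, _ in counts.most_common(k)]
    columns = top + ["other"]
    rows = []
    for value in values:
        key = value if value in top else "other"
        rows.append([1 if column == key else 0 for column in columns])
    return columns, rows

# Hypothetical categorical feature with five distinct values
cities = ["pune", "delhi", "pune", "mumbai", "delhi", "pune", "kochi", "agra"]
columns, rows = one_hot_top_k(cities, k=2)
print(columns)   # three columns instead of five
print(rows[0])   # encoding of "pune"
print(rows[3])   # encoding of "mumbai", which falls into "other"
```

Plain one-hot encoding would create five columns for this feature; grouping rare values trades some detail for a smaller table. With pandas, the final encoding step would typically be done with `pandas.get_dummies` after the rare values have been relabeled.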
Okay, and that is how you have to do reverse engineering at each and every stage, try to resolve the particular problem, and come up with a very good accuracy. The more reverse engineering you do, the better. So don't follow a path where you practice this separately, and this separately, and this separately. I have even worked in Tableau and Power BI to get some visualization knowledge. Because of that, you will understand how the data is distributed, and that will give you knowledge, because you are able to learn something from the data. And if you know some statistics concepts, you can derive a lot of insights from that data.

This was the technique that I used to learn data science, and I think you should also use this technique to learn smartly. Trust me on this: if you are able to solve one use case while understanding the machine learning algorithm, you will be able to do the next one very quickly, because you will be applying the same concepts in another use case. Even if the data set is bigger, not a problem. You will face some problems while handling a bigger data set, but that will only be the first time; later on you will be much more comfortable.

So this is all about this particular video. I hope you liked it. Please share it with all your friends, and please do subscribe to my channel. I want this community to become a little bit larger. Let people know about it; all this content is completely free. I appreciate the help. I have just crossed 10,000 subscribers, and that's a great feeling. So thank you, one and all. I'll see you all in the next video. Have a great day. Thank you.
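The data checks mentioned in the captions (converting a feature to a standard scale, looking for outliers, and checking whether the classes are imbalanced) could be sketched as follows. This is a minimal illustration using only the Python standard library, not the speaker's own code; the function names, the 1.5 × IQR outlier rule, and the sample prices and labels are assumptions made for the example.

```python
import statistics
from collections import Counter

def standardize(xs):
    """Rescale values to mean 0 and standard deviation 1 (z-scores)."""
    mean = statistics.fmean(xs)
    std = statistics.pstdev(xs)
    return [(x - mean) / std for x in xs]

def iqr_outliers(xs, factor=1.5):
    """Return values outside [Q1 - factor*IQR, Q3 + factor*IQR]."""
    q1, _, q3 = statistics.quantiles(xs, n=4)
    iqr = q3 - q1
    low, high = q1 - factor * iqr, q3 + factor * iqr
    return [x for x in xs if x < low or x > high]

def class_shares(labels):
    """Return the fraction of samples that belongs to each class."""
    counts = Counter(labels)
    return {label: count / len(labels) for label, count in counts.items()}

# Hypothetical house prices and target labels
prices = [42, 45, 47, 50, 52, 55, 58, 60, 300]
labels = ["sold"] * 8 + ["unsold"] * 2

z_scores = standardize(prices)
print(iqr_outliers(prices))   # [300]
print(class_shares(labels))   # one class holds 80% of the samples
```

Standardizing rescales a feature but does not change the shape of its distribution, so a skewed feature stays skewed afterwards. A plot from Matplotlib or seaborn is the usual way to check the shape itself.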
Info
Channel: Krish Naik
Views: 1,066,866
Rating: 4.9526763 out of 5
Keywords: data science learning path, how to learn data science quora, data science course, data science tutorial, learn python for data science free, analytics vidhya data science course, data camp, the most comprehensive data science learning plan for 2018, upgrad, coursera, great learning, appliedaicourse
Id: csG_qfOTvxw
Length: 12min 12sec (732 seconds)
Published: Thu Aug 22 2019