Machine Learning in R - Classification, Regression and Clustering Problems

Video Statistics and Information

Captions
Well done! If you got this far, you already have an idea about some machine learning problems. Let's get a bit more specific and discuss three common types of machine learning problems: classification, regression and clustering. These are all very broad topics, so I'll stick to a brief introduction in this video to get you familiar with these techniques. The last three chapters of this course go into more detail.

First up is classification. A classification problem involves predicting whether a given observation belongs to a certain category. Remember how I compared machine learning to the estimation of a function? Well, based on earlier observations of how the input maps to the output, classification tries to estimate a classifier that can generate an output for arbitrary input. A classifier can then label an unseen example with a class.

The possible applications of classification are very broad. For example, after a set of clinical examinations that relate vital signals to a disease, you could predict whether a new patient with an unseen set of vital signals suffers from that disease and needs further treatment. Another, totally different example is classifying a set of animal images into cats, dogs and horses, given that you have trained your model on a bunch of images for which you know what animal they depict. Can you think of a possible classification problem yourself?

What's important here is that, first off, the output is qualitative and, second, that the classes to which a new observation can belong are known beforehand. In the first example I mentioned, the classes are sick and not sick. In the second example, the classes are cat, dog and horse. In chapter three, we will do a deep analysis of classification and you'll get to work with some fancy classifiers.

Moving on: a regression problem is a kind of machine learning problem that tries to predict a continuous or quantitative value for an input, based on previous information. The input variables are called the predictors and the output the response.
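The sick / not sick example from the classification part can be sketched in R. This is only an illustration under stated assumptions: the data are simulated, the column names `temperature`, `heart_rate` and `status` are hypothetical, and logistic regression via `glm()` is used as a stand-in classifier, since the video does not name a specific one.

```r
# Simulated (hypothetical) training data: two vital signals per patient
# plus the known label from earlier clinical examinations.
set.seed(42)
n <- 100
sick <- rep(c(0, 1), each = n / 2)
patients <- data.frame(
  temperature = rnorm(n, mean = 37 + 1.0 * sick, sd = 0.7),
  heart_rate  = rnorm(n, mean = 70 + 12 * sick, sd = 10),
  status      = factor(ifelse(sick == 1, "sick", "not sick"),
                       levels = c("not sick", "sick"))
)

# Estimate a classifier from the labeled observations.
# With a factor response, glm() models the probability of the
# second level ("sick").
classifier <- glm(status ~ temperature + heart_rate,
                  data = patients, family = binomial)

# Label an unseen patient with one of the classes known beforehand.
new_patient <- data.frame(temperature = 38.9, heart_rate = 95)
p_sick <- predict(classifier, newdata = new_patient, type = "response")
predicted_class <- ifelse(p_sick > 0.5, "sick", "not sick")
print(predicted_class)
```

The output is qualitative: the prediction is always one of the two classes that appeared in the training labels.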
In some sense, regression is pretty similar to classification. You're also trying to estimate a function that maps input to output based on earlier observations, but this time you're trying to estimate an actual value, not just the class of an observation.

Do you recall the example from the last video? There, we had a data set on a group of people's height and weight. A valid question could be: is there a relationship between the height and the weight? For example, will a change in height correlate linearly with a change in weight? And if so, can you predict the height of a new person given their weight? These questions can be answered with linear regression. Here, you're trying to fit a linear function between the predictor, the weight, and the response, the height. Together, beta0 and beta1 are known as the model coefficients or parameters. As soon as you know the coefficients beta0 and beta1, the function is able to convert any new input to output. This means that solving your machine learning problem is actually finding good values for beta0 and beta1. These are estimated based on previous input-to-output observations. I will not go into detail on how to compute these coefficients; the function lm() does this for you in R.

Now, are you asking what regression can be useful for, apart from some silly weight and height problems? Well, there are many different applications of regression, going from modeling credit scores based on past payments, to finding the trend in your YouTube subscriptions over time, or even estimating your chances of landing a job at your favorite company based on your college grades.

All these problems have two things in common. First off, the response, or the thing you're trying to predict, is always quantitative. Second, you will always need knowledge of previous input-to-output observations in order to build your model. The fourth chapter of this course will be devoted to a more comprehensive overview of regression.

So, classification: check. Regression: check. Last but not least, there is clustering.
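The height and weight regression can be sketched with `lm()`. The data below are simulated and the numbers are made up for illustration; the linear function is assumed to have the form height = beta0 + beta1 * weight, which matches the description of a predictor, a response and two coefficients.

```r
# Simulated (hypothetical) observations of weight in kg and height in cm.
set.seed(1)
weight <- runif(50, min = 50, max = 100)
height <- 120 + 0.7 * weight + rnorm(50, sd = 4)
people <- data.frame(weight, height)

# lm() estimates the coefficients from the input-to-output observations.
fit <- lm(height ~ weight, data = people)
coefs <- coef(fit)
beta0 <- coefs[["(Intercept)"]]  # intercept
beta1 <- coefs[["weight"]]       # slope

# Once beta0 and beta1 are known, any new input can be converted to output.
new_person <- data.frame(weight = 72)
predicted_height <- predict(fit, newdata = new_person)

# The same prediction computed by hand from the coefficients.
manual_height <- beta0 + beta1 * 72
print(c(beta0 = beta0, beta1 = beta1))
print(predicted_height)
```

`predict()` and the hand-computed value agree, which shows that the fitted model really is just the two estimated coefficients.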
In clustering, you're trying to group objects that are similar in clusters, while making sure the clusters themselves are dissimilar. You can think of it as classification, but without saying to which classes the observations have to belong or how many classes there are. Take the animal photos, for example. In the case of classification, you had information about the actual animals that were depicted. In the case of clustering, you don't know what animals are depicted; you simply get a set of pictures. The clustering algorithm then groups similar photos in clusters.

You could say that clustering is different in the sense that you don't need any knowledge about the labels. Moreover, there is no right or wrong in clustering: different clusterings can reveal different and useful information about your objects. This makes it quite different from both classification and regression, where there always is a notion of prior expectation or knowledge of the result.

An intuitively straightforward clustering method is k-means. This method will cluster your data in k clusters based on some similarity measure. More on this in the fifth and final chapter of this course.

Well, enough theory for a while. It's time to roll up your sleeves, head over to the exercises and tackle some classification, regression and clustering problems.
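As a rough illustration of k-means, the sketch below clusters unlabeled two-dimensional points with base R's `kmeans()`. The data are simulated, the name `objects_xy` is hypothetical, and choosing 3 clusters is an assumption made for this example; `kmeans()` uses Euclidean distance as its similarity measure.

```r
# Simulated (hypothetical) unlabeled objects described by two features.
set.seed(7)
objects_xy <- rbind(
  cbind(rnorm(30, mean = 0, sd = 0.5), rnorm(30, mean = 0, sd = 0.5)),
  cbind(rnorm(30, mean = 4, sd = 0.5), rnorm(30, mean = 0, sd = 0.5)),
  cbind(rnorm(30, mean = 2, sd = 0.5), rnorm(30, mean = 4, sd = 0.5))
)

# Group the objects into k = 3 clusters. No labels are supplied;
# nstart = 20 reruns the algorithm from several random starts
# and keeps the best result.
km <- kmeans(objects_xy, centers = 3, nstart = 20)

print(table(km$cluster))  # number of objects per cluster
print(km$centers)         # estimated cluster centers
```

The cluster numbers carry no meaning of their own: rerunning with a different seed can permute them, which fits the point that there is no right or wrong labeling in clustering.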
Info
Channel: DataCamp
Views: 60,563
Rating: 4.7859778 out of 5
Keywords: Cluster Analysis, Machine Learning (Software Genre), R (Programming Language), Statistical Classification, Regression Analysis
Id: 6za9_mh3uTE
Length: 6min 39sec (399 seconds)
Published: Sat Dec 05 2015