Bias Variance Trade off

Video Statistics and Information

Captions
Hello everyone, welcome to The Semicolon. This video is part of the series Data Analytics with Python, and today we're going to learn about the bias-variance tradeoff. For those who are wondering why we need to know about the bias-variance tradeoff: whenever you use a machine learning model, you want it to fit your data well, to be consistent, and to make few errors in prediction. You also want similar results when it is trained on similar data sets. Hence understanding what bias and variance are, and what their tradeoff means in a machine learning model, is very important.

The errors in a machine learning model are mainly bias and variance errors. There is also some irreducible error, but since it cannot be reduced we shall not talk about it.

So, bias is mathematically the expected error in the predictions of a model. It can be simply understood as the error we make due to assumptions in the model. For example, when we try to solve a problem by applying linear regression to it, we are assuming that the target has a linear relationship with its features, which may not be right, and the errors due to the linearity assumption in this case are bias errors. Having high bias means the model is underfitting. Bias is formally defined as the expectation of the predicted value minus the actual value, which is the error, so it is the expectation of the error.

Variance is a measure of the variability in the results predicted by our model. To put this in a simple way, variance quantifies the difference in predictions when we change the training data set. So when we have high variance, it means that our predictions for the same test case are going to be very different when the model is trained on different data sets. Having high variance signifies that our model is overfitting.
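The spoken definitions above can be written compactly. The notation is an assumption, not taken from the video: f is the true relationship, \hat{f} is the model fitted on a randomly drawn training set, the observed target is y = f(x) + \varepsilon with zero-mean noise of variance \sigma^2 that is independent of the training set, and expectations are taken over training sets and noise. Under those assumptions, the standard decomposition for squared error is:

```latex
\begin{aligned}
\mathrm{Bias}\big[\hat{f}(x)\big] &= \mathbb{E}\big[\hat{f}(x)\big] - f(x) \\
\mathrm{Var}\big[\hat{f}(x)\big]  &= \mathbb{E}\Big[\big(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\big)^2\Big] \\
\mathbb{E}\Big[\big(y - \hat{f}(x)\big)^2\Big]
  &= \mathrm{Bias}\big[\hat{f}(x)\big]^2 + \mathrm{Var}\big[\hat{f}(x)\big] + \sigma^2
\end{aligned}
```

The three terms on the last line are the squared bias, the variance, and the irreducible error that the video sets aside.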
So to understand bias and variance errors better, let us look at this bull's-eye diagram. The red circle represents the target. Having low bias and low variance results in our model predicting values very close to the target; as you can see, the predicted values are on the red circle, which is our target. Having high variance and low bias means the predictions are highly variable around the target. Such a model is not consistent. Having high bias and low variance means the model is consistent, but the predictions are far away from the target. Even this situation is not desirable. And having high bias and high variance is a total disaster: the model is not consistent, and the predictions are both highly variable and far from the target.

So let us take an example. Say we have two training data sets. Our task is to fit a curve on these data sets. Remember, a machine learning model's accuracy is a measure of how well it does on the test data set. If we assume a linear relationship between the target and the features, both training data sets give a very similar line. In such a case the model is consistent, because the difference in the predictions is quite low, but the line may not be a correct representation of the points, and hence it adds an error, which is bias error. If we instead assume a polynomial curve, the difference in predictions is huge. The predictions may not even hold on the test set, and a situation like this gives rise to variance error.

So the bias-variance tradeoff is about getting an optimal bias and variance for the model. When we increase the bias, the variance decreases, and when we increase the variance, the bias decreases, so getting an optimal bias and variance is a task. So the question remains: how do we do it?
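The line-versus-polynomial example can be tried numerically. The following is a minimal sketch under stated assumptions, none of which come from the video: the true curve is sin(2πx), each training set has six fixed x values with Gaussian noise on y, the "polynomial curve" is the degree-5 polynomial that passes through every training point, and the helper names (`true_f`, `fit_line`, `fit_interpolating_poly`, `bias_and_variance`) are hypothetical. It uses many training sets instead of two so that bias and variance at one test point can be estimated.

```python
import math
import random


def true_f(x):
    # Assumed ground-truth relationship (not from the video).
    return math.sin(2 * math.pi * x)


def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns a predict function."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_interpolating_poly(xs, ys):
    """Polynomial through every training point (Lagrange form)."""
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            weight = 1.0
            for j, xj in enumerate(xs):
                if j != i:
                    weight *= (x - xj) / (xi - xj)
            total += weight * yi
        return total
    return predict


def bias_and_variance(fit, x0, n_sets=2000, noise_sd=0.3, seed=0):
    """Estimate bias and variance of the prediction at x0 over many training sets."""
    rng = random.Random(seed)
    xs = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
    preds = []
    for _ in range(n_sets):
        # Each training set shares the x values but has fresh noise on y.
        ys = [true_f(x) + rng.gauss(0, noise_sd) for x in xs]
        preds.append(fit(xs, ys)(x0))
    mean_pred = sum(preds) / n_sets
    bias = mean_pred - true_f(x0)
    variance = sum((p - mean_pred) ** 2 for p in preds) / n_sets
    return bias, variance


line_bias, line_var = bias_and_variance(fit_line, x0=0.1)
poly_bias, poly_var = bias_and_variance(fit_interpolating_poly, x0=0.1)
print(f"line:       bias={line_bias:+.3f}  variance={line_var:.4f}")
print(f"polynomial: bias={poly_bias:+.3f}  variance={poly_var:.4f}")
```

In this setup the straight line should show the larger bias and the smaller variance, and the interpolating polynomial the reverse, which matches the behaviour described for the two curves.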
And there is not one answer. It actually depends on the training algorithm. Some common ways to do it are as follows. Reducing the dimensionality of the data removes some features which add to the variance. Regularizing linear regression and artificial neural networks also helps. Using mixture models and ensemble methods is one of the most proven ways to improve the model in this aspect. In the case of the k-nearest neighbors algorithm, using the optimal value of K can optimize bias and variance. The bias and variance concepts become very important while choosing a training algorithm for your problem.

So that is it for this tutorial. Hit the like button if this tutorial helped you. Lots of new videos are coming up, so you might also want to go ahead and hit the subscribe button, and if you have any questions, feel free to ask them in the comments below. Thank you.
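The remark about choosing K for k-nearest neighbors can be illustrated with a small experiment. This is a sketch under assumptions that are not in the video: a one-dimensional regression problem with true curve sin(2πx), 21 evenly spaced training inputs, Gaussian noise on the targets, and hypothetical helper names (`knn_predict`, `knn_bias_variance`). It estimates bias and variance of the prediction at a single test point for a small, a medium, and a very large K.

```python
import math
import random


def true_f(x):
    # Assumed ground-truth relationship (not from the video).
    return math.sin(2 * math.pi * x)


def knn_predict(xs, ys, x0, k):
    """Average the targets of the k training points closest to x0."""
    nearest = sorted(range(len(xs)), key=lambda i: abs(xs[i] - x0))[:k]
    return sum(ys[i] for i in nearest) / k


def knn_bias_variance(k, x0=0.25, n_sets=2000, noise_sd=0.3, seed=0):
    """Estimate bias and variance of the k-NN prediction at x0."""
    rng = random.Random(seed)
    xs = [i / 20 for i in range(21)]
    preds = []
    for _ in range(n_sets):
        # Same inputs every time, fresh noise on the targets.
        ys = [true_f(x) + rng.gauss(0, noise_sd) for x in xs]
        preds.append(knn_predict(xs, ys, x0, k))
    mean_pred = sum(preds) / n_sets
    bias = mean_pred - true_f(x0)
    variance = sum((p - mean_pred) ** 2 for p in preds) / n_sets
    return bias, variance


results = {k: knn_bias_variance(k) for k in (1, 5, 21)}
for k, (bias, variance) in results.items():
    print(f"k={k:2d}  bias={bias:+.3f}  variance={variance:.4f}")
```

With K = 1 the prediction follows a single noisy point (low bias, high variance); with K equal to the whole training set it is just the overall mean (high bias, low variance). A value in between trades one for the other, and in practice it would typically be picked by validation on held-out data.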
Info
Channel: The Semicolon
Views: 30,269
Rating: 4.8561153 out of 5
Keywords: Bias Variance Trade off, Bias, Variance, Machine Learning, Data Analytics, Data Analytics with python, over fitting, under fitting, accuracy, data science, bias vs variance, bias variance machine learning, bias variance explained, tradeoff, improving accuracy, bias variance tradeoff, bias variance bulls eye, bias variance tradeoff visualization, Bias–variance Tradeoff
Id: lpkSGTT8uMg
Length: 5min 24sec (324 seconds)
Published: Sat Oct 07 2017