Bagging vs Boosting - Ensemble Learning In Machine Learning Explained

Video Statistics and Information

Captions
Hi everyone, this is WhyML, and today we are going to talk about two of the most used methods in ensemble learning: bagging and boosting. We'll see how each algorithm works and what the similarities and differences between the two are. So let's not waste any more time and let's dig in.

To start with, let's suppose that we have a dataset. What bagging does with this dataset is quite simple: it creates sub-datasets of equal size by sampling with replacement from the original dataset, a technique known as bootstrapping, and trains a classifier on each sub-dataset. Then, at the end, we use all these models to make predictions in an ensemble classifier. So, in a nutshell, bagging consists of two steps: one is bootstrapping the dataset and two is aggregating the results, hence its name, bagging.

When using boosting, on the other hand, the way we create these models changes quite a bit. Let's suppose that we have that same dataset. The first thing we do is to train a classifier on this dataset and see which samples were correctly classified by the model and which ones were incorrectly classified. Then we use this information to weigh up the samples which were incorrectly classified, so that when we train the next model it pays more attention to those samples and hopefully learns to classify them correctly. Then we look again at the incorrectly classified samples, weigh them up, train a new model, look at its predictions, weigh up the misclassified samples, and so on and so on until we get the desired number of models. At the end, as in the bagging case, we use all these models in an ensemble to make predictions on new data.

Now let's see what the similarities and differences between the two are, by first looking at how they work. At a high level, both methods build an ensemble of models, but bagging builds them in parallel while boosting builds them sequentially. Knowing this may help us in choosing which one to use depending on the computing resources and development time we have at our disposal: if we have a lot of computing power, then due to its parallel nature bagging may be a more suitable algorithm, since it may take a lot less time to train the models, while we might get no significant improvement in training time for boosting due to its sequential nature.

Another thing that we might consider when looking at these two ensemble learning methods is the dataset on which they train the classifiers. Both methods build a separate dataset for each model, but they do it differently: bagging uses a subset of the original dataset that is generated by sampling with replacement, while boosting uses the same samples as in the original dataset. Also, another important distinction is that in bagging the samples are unweighted, while in boosting they are weighted according to the predictions given by the previous classifier.

How each ensemble makes its predictions is yet another important dimension to analyze. Both methods make predictions by taking the average of the models, but in bagging the classifiers are equally weighted, while in boosting the models are weighted in the ensemble based on their training performance.
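(Not from the video) Below is a minimal sketch of the two-step bagging procedure described above, bootstrapping plus aggregation, assuming NumPy arrays, non-negative integer class labels, and scikit-learn decision trees as the base classifiers; names such as `bagging_fit` and `n_models` are illustrative, not taken from the video.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def bagging_fit(X, y, n_models=10, seed=0):
    """Train `n_models` trees, each on a bootstrap sample of (X, y)."""
    rng = np.random.default_rng(seed)
    n = len(X)
    models = []
    for _ in range(n_models):
        # Step 1 (bootstrapping): draw n indices *with replacement*.
        idx = rng.integers(0, n, size=n)
        models.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
    return models

def bagging_predict(models, X):
    """Step 2 (aggregating): every model gets an equal, unweighted vote."""
    votes = np.stack([m.predict(X) for m in models])  # shape: (n_models, n_samples)
    return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
```

Because each bootstrap sample is drawn independently, the loop over models could just as well run in parallel, which is the point made above about bagging and computing resources.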
As in any machine learning problem, the bias and variance of the system play an important role in the final performance. In our case, because they are ensembles, both bagging and boosting are good at reducing the variance. However, bagging achieves close to zero bias reduction, the reason being that, since we don't change the weighting of our data when sampling, the bias of the individual models is transferred to the ensemble. This doesn't happen in the boosting case, since the samples are reweighted from one model to another, but unfortunately this makes boosting more prone to overfitting in comparison with bagging.

I know that I may have left some questions unanswered in this video, and things like "why do we sample with replacement in bagging?" or "why exactly is boosting more prone to overfitting?" may have popped into your mind. Well, I've done that on purpose, mostly because you may notice the thing that these questions have in common: the why, which is the main theme of this channel. So I intend to make videos about these subjects in the future.

This was the video for today. I hope you enjoyed it. Please leave a like if you did, share your thoughts with me in the comment section, subscribe to stay up to date with new content, and until next time I wish you a wonderful time. Bye bye!
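(Not from the video) For completeness, here is a minimal sketch of the sequential reweighting idea behind boosting, in the AdaBoost style, assuming binary labels encoded as -1/+1 and decision stumps as the weak learners; the function names and the specific update rule are illustrative assumptions rather than details given in the video.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def boosting_fit(X, y, n_models=10):
    """AdaBoost-style training, assuming y takes values in {-1, +1}."""
    n = len(X)
    weights = np.full(n, 1.0 / n)              # start with equal sample weights
    models, alphas = [], []
    for _ in range(n_models):
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=weights)
        pred = stump.predict(X)
        err = np.clip(np.sum(weights * (pred != y)), 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)  # model weight from training performance
        # Weigh up the samples the current model got wrong, weigh down the rest.
        weights *= np.exp(-alpha * y * pred)
        weights /= weights.sum()
        models.append(stump)
        alphas.append(alpha)
    return models, alphas

def boosting_predict(models, alphas, X):
    """Weighted vote: better-performing models count more in the ensemble."""
    scores = sum(a * m.predict(X) for a, m in zip(alphas, models))
    return np.sign(scores)
```

Note how each model receives a weight (alpha) based on its weighted training error, so the final prediction is a weighted vote rather than the equal vote used in bagging, matching the aggregation difference discussed in the transcript.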
Info
Channel: DataMListic
Views: 54,378
Keywords: deep learning, machine learning, artificial intelligence, ml, dl, ai, data science, ds, why ml, why dl, bagging, bootstrap, bootstrap aggregating, boosting, ensemble learning, boosting vs bagging, bagging vs boosting
Id: tjy0yL1rRRU
Length: 4min 23sec (263 seconds)
Published: Sat Jan 14 2023