Boosting Explained-AdaBoost|Bagging vs Boosting|How Boosting and AdaBoost works

Video Statistics and Information

Captions
Let us start the discussion with some questions from my previous videos. What is ensemble learning? Ensemble learning is a technique in which multiple individual models are combined to create one master model; that process is called ensembling. What are bagging and boosting? These are the two main ways to implement ensemble models. And what is random forest? Random forest is an implementation of the bagging technique. Welcome to Unfold Data Science, this is Aman here and I am a data scientist. In this video I will discuss the other ensembling technique, known as boosting, and one implementation of that technique known as AdaBoost. Let's start.

Having defined bagging, boosting and ensemble learning, the very first thing I want to cover is the difference between bagging and boosting, or, to be more specific, between one example from each family: random forest and AdaBoost. Random forest is a bagging technique and AdaBoost is a boosting technique, and both come under the family of ensemble learning.

Difference number one: random forest is a parallel learning process, whereas AdaBoost is a sequential learning process. In random forest the individual models, the individual decision trees, are built from the main data in parallel, independently of each other. I explained random forest in one of my videos, the link for which is right here; you can go ahead and watch it. In random forest multiple trees are built from the same data in parallel and no tree depends on any other tree, hence the process is called parallel. In the sequential process that AdaBoost follows, each tree depends on the previous tree: if the ensemble consists of machine learning model 1, model 2 and model 3, then model 2 depends on the output of model 1, and model 3 depends on the output of model 2. This process, where every model depends on the previous one, is called sequential learning. This is one basic difference between the bagging family and the boosting family, or more specifically between random forest and AdaBoost.

Difference number two: suppose multiple models are fit and combined into a bigger, master model. In random forest, all the models have an equal say in the final model. For example, if 10 models, that is 10 decision trees, are created in a random forest, then all 10 have an equal vote in the final prediction; every tree counts the same in the final model. In AdaBoost, the trees do not all have an equal say: some individual models carry more weight in the final model and some carry less.

Difference number three: in random forest, every individual model is a fully grown decision tree. In AdaBoost the trees are not fully grown; each tree is just one root node and two leaf nodes. In the language of AdaBoost these are called stumps. So random forest has fully grown trees as its individual models, while AdaBoost has stumps.
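The video does not walk through code, but as a rough illustration of this contrast a minimal scikit-learn sketch could look like the following; it assumes scikit-learn is installed, uses a synthetic dataset, and passes depth-1 trees as the stumps (the parameter is `estimator` in recent scikit-learn versions, `base_estimator` in older ones).

```python
# A minimal sketch contrasting bagging (random forest) and boosting (AdaBoost).
# Assumes scikit-learn >= 1.2; older versions use `base_estimator` instead of `estimator`.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=5, random_state=0)

# Bagging: fully grown trees, built independently of each other, equal vote.
rf = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)

# Boosting: stumps (max_depth=1), built sequentially, each with its own say (weight).
ada = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),  # one root node, two leaf nodes
    n_estimators=10,
    random_state=0,
).fit(X, y)

# Random forest trees all count equally; AdaBoost stores a weight per weak learner.
print("AdaBoost learner weights:", ada.estimator_weights_)
```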
So, to recap, these are the three basic differences between how a bagging algorithm works and how a boosting algorithm works: bagging is parallel learning while boosting is sequential learning; in bagging all the individual models, or in other words all the weak learners as they are called in the world of machine learning, have an equal say in the final model, whereas in boosting some weak learners carry a higher weight than others (we will see with an example how that weight changes); and in bagging the individual models are fully grown decision trees, while in boosting they are stumps, each with one root node and two leaf nodes.

Now let us take an example and try to understand how AdaBoost works. I am considering a simple example here: a dataset with three columns, age, BMI and gender. This is a dummy dataset that I have created, where gender is the target column (the response variable) and the other two are the independent variables. Let's say we fit a boosting algorithm, AdaBoost, on this data. The very first thing the algorithm does is assign a weight to every record; I am calling these the initial weights, W. These weights must sum to one, so with five records each one gets a weight of 1/5. Initially this same weight is assigned to every record, which means all records are equally important to the model.

As I said, AdaBoost is a sequential learning process, so the first model, the first weak learner or base model, is fit on this data. In AdaBoost the weak learners are stumps: one root node and two leaf nodes. That is all one weak learner is. How do we create this stump? The fundamental concept remains the same as for any decision tree: Gini index or entropy. The two independent columns are the candidate columns for the root node, the Gini index or information gain is checked, a split condition is selected, and the stump is created. I have discussed Gini and entropy in detail in another video, which you can go ahead and watch.

Once this stump is created, the training data is tested for accuracy against it, and there is a possibility that some of the classifications the stump makes are wrong. Let's say the stump is created, testing happens on the five records, and the predictions come out as: correct, correct, wrong, correct, correct. What happens in the next iteration is that these initial weights are changed, and this is the key idea of the AdaBoost technique.
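A minimal sketch of this first step, assuming scikit-learn and pandas are available, could look like the code below: assign equal initial weights of 1/n to every record, fit one stump, and find the misclassified rows. The actual values in the table are hypothetical placeholders, since the video only names the columns.

```python
# First AdaBoost step: equal initial weights, one stump, identify misclassified records.
# The data values below are invented placeholders; only the column names come from the example.
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

df = pd.DataFrame({
    "age":    [25, 40, 35, 50, 28],
    "bmi":    [22.0, 30.5, 27.1, 31.2, 24.3],
    "gender": ["M", "F", "M", "F", "M"],   # target / response variable
})
X = df[["age", "bmi"]]
y = df["gender"]

n = len(df)
weights = np.full(n, 1.0 / n)              # initial weights: 1/5 each, summing to 1

# First weak learner: a stump (one root node, two leaf nodes), split chosen by Gini index.
stump = DecisionTreeClassifier(max_depth=1, criterion="gini")
stump.fit(X, y, sample_weight=weights)

misclassified = stump.predict(X) != y      # these records will get more weight next round
print(df[misclassified])
```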
Initially we started by giving the same weight to every record, which means all the records were equally important to the model. What happens in the next iteration, the next model, is this: whatever has been misclassified, in this case that one particular record, gets a higher weight, and to keep the weights normalized, the weights of all the correctly classified records come down. So in the next weak learner, more importance is given to the previously misclassified records, and that learner will try to classify this record correctly because it now carries more weight; the next learner focuses more on this record.

Here we are talking about only five records. What if we had one million records? There would be a good number of records misclassified by this weak learner, and those records would be given higher weights for the next learner, machine learning model 2. Similarly, model 2 will misclassify some observations, those observations will again be given more weight, and the other observations' weights will come down so that they still sum to one. In this way model 1, model 2 and model 3 get created, and whatever misclassification happens in the previous model, the next model tries to correct. This is how, in sequence, one model takes the input from the previous model and tries to improve the classification. That is why the name is adaptive boosting: it adapts to the previous model, and whatever the previous model is not able to classify, the next model tries to classify. In the end, the final model is a combination of all these learners, which is why this technique is called boosting and this algorithm is called AdaBoost.

The important things to understand here are the initialization of the weights and the adjustment of the weights based on misclassification. The internal fundamentals of creating decision trees and stumps, Gini, entropy and all those things, remain the same; what is different here is these weights and their adjustment. I have deliberately not gone into the mathematical details of how the weights are adjusted; to keep it simple I am just saying that some weights go up and others come down, although there is a formula through which this happens and I can cover that as well. If you have any question on the AdaBoost technique, let me know in the comments, and I will implement AdaBoost in Python and show you in my upcoming videos. I'll see you all in the next video, till then take care. [Music]
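For reference, the formula the video alludes to but skips can be sketched as follows. This is a minimal from-scratch version of the standard discrete AdaBoost update, assuming labels encoded as -1/+1 and scikit-learn stumps as the weak learners; it is an illustration of the usual rule, not the exact notation used in the video.

```python
# A from-scratch sketch of the sequential AdaBoost loop described above, using the
# standard discrete-AdaBoost weight update: alpha = 0.5 * ln((1 - err) / err).
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_fit(X, y, n_rounds=10):
    """y must be encoded as -1/+1. Returns the stumps and their 'say' (alpha) values."""
    n = len(y)
    w = np.full(n, 1.0 / n)                      # start: every record equally important
    stumps, alphas = [], []
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)
        pred = stump.predict(X)
        err = np.sum(w[pred != y])               # weighted error of this weak learner
        err = np.clip(err, 1e-10, 1 - 1e-10)     # guard against division by zero / log(0)
        alpha = 0.5 * np.log((1 - err) / err)    # this learner's say in the final model
        w = w * np.exp(-alpha * y * pred)        # misclassified weights go up, others go down
        w = w / w.sum()                          # renormalize so the weights sum to 1
        stumps.append(stump)
        alphas.append(alpha)
    return stumps, alphas

def adaboost_predict(stumps, alphas, X):
    # Final model: weighted vote of all the weak learners, each with its own say.
    scores = sum(a * s.predict(X) for s, a in zip(stumps, alphas))
    return np.sign(scores)
```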
Info
Channel: Unfold Data Science
Views: 42,466
Keywords: machine learning Unfold data science, boosting machine learning, boosting algorithm, adaptive boosting, adaptive boosting machine learning, boosting algorithm example, boosting algorithm explained, boosting vs bagging, ensemble learning boosting, adaboost, what is boosting in machine learning, what is boosting and bagging, machine learning, boosting and bagging machine learning, unfold data science
Id: HRBMlBiOo7Q
Length: 12min 31sec (751 seconds)
Published: Tue Feb 11 2020