Tutorial 43-Random Forest Classifier and Regressor

Video Statistics and Information

Captions

Hello all, my name is Krish Naik, and welcome to my YouTube channel. Today we are going to discuss random forests. In my previous video on bagging, I told you that one of the most widely used techniques is something called random forest. A random forest classifier or regressor is basically a bagging technique, and we are going to discuss both of them in this session.

Let me show you an example of how a random forest works. Suppose this is my dataset D. I told you that in bagging we have many base learners, that is, base learning models: this is my model M1, this is M2, this is M3, and so on up to Mn. In a random forest, each of these models is a decision tree.

Suppose this dataset has d records and m columns. From this dataset we pick a sample of rows and a sample of features. First I pick some rows; I will call this row sampling with replacement, and I will explain in a moment what "with replacement" means. Similarly, I pick some columns, which I will write as feature sampling, FS. That is how bagging works: we take some rows and some features and give them to decision tree 1, and we do the same for decision tree 2, 3, 4, and so on.

Suppose the sample given to a tree has d' rows and n features. Always remember that d' is less than d, because we are only taking a sample of the records, and n is less than m, because we are only taking a sample of the features. So I take some rows and some features and give them to decision tree 1, and decision tree 1 gets trained on that sample.

For decision tree 2, row sampling again happens with replacement. What "with replacement" means here is that some of the records that went to decision tree 1 may come up again in the sample for decision tree 2. Not all of the records will be repeated; I am drawing another sample of records and giving it to decision tree 2. So when I do row sampling and feature sampling again, some records may be repeated and some features may be repeated, but many of the records will be different. For example, if decision tree 1 was given features 1, 2, 3, 4, 5, decision tree 2 might be given features 1, 3, 4, 5, 6, 7, and the row sampling varies in the same way. Decision tree 2 then gets trained on its own sample. The same thing happens for every decision tree: we perform row sampling and feature sampling, and the tree gets trained on that data. After that, each tree is able to give a prediction.

Next, whenever I get my test data, suppose I give one record of the test data to each decision tree. Say this is a binary classification problem: decision tree 1 gives me 1, the next tree also gives me 1, the next gives me 0, and the last gives me 1. The sampling step is the bootstrap, and according to bagging, the final step is to aggregate. For aggregating I use the majority vote. Here three of the four models say 1, so my final output is 1. This is how a random forest works: the base learner is a decision tree.

You need to understand one more thing: what happens when we use many decision trees in a random forest. A decision tree grown to its complete depth has two properties: low bias and high variance. Let me explain both. Low bias means that if I grow my decision tree to its complete depth, it gets trained very closely on the training dataset, so the training error will be very small. High variance means that whenever we get new test data, such a decision tree is prone to give a larger amount of error. In short, growing a decision tree to its complete depth leads to overfitting.

So what happens in a random forest? I am using multiple decision trees, and we know that each one has high variance. But when we combine all the decision trees using the majority vote, this high variance gets converted into low variance. Because we use row sampling and feature sampling and give different records to each tree, each decision tree becomes an expert on the specific rows it was trained on. To convert the high variance into low variance, we take the majority vote; we are not depending on the output of just one decision tree.

There is one more advantage you need to understand. Suppose I have a thousand records and I change two hundred of them. Will this change in the data impact the random forest? Understand, guys, that we are doing row sampling and feature sampling for each and every decision tree. Those two hundred changed records get split between the decision trees: some of them go to decision tree 1, some to decision tree 2, some to 3, some to 4. So the change does not make much impact on any single decision tree with respect to accuracy or output. That is why, even when our data changes or we get new test data, the variance stays low and the error rate stays low: we are doing row sampling and feature sampling, giving the samples to the decision trees, and taking the majority vote. This is the most important property of random forests.

Random forest works very well for most of the machine learning use cases you are likely to work on, and I have seen that in most companies developers have made random forest their favorite algorithm, whether as a classifier or a regressor.

One more point I missed: suppose this is not a binary classification problem but a regression problem. Now each decision tree gives me a continuous value. In a regression problem we take either the mean or the median of all these outputs; it depends on the distribution of the outputs the decision trees give. The random forest in sklearn takes the average of the outputs from all the decision trees.

It is that simple. If I use a single decision tree, it will have low bias and high variance. If I want to convert this high variance into low variance, I have to use multiple decision trees, and I also have to use row sampling and feature sampling. Low variance means that our accuracy on new data, the test data, will be very good.

So this was all about random forests, and I have explained both the classifier and the regressor. The only difference is that the classifier uses the majority vote, whereas the regressor finds the mean or the median of the outputs of all the decision trees.

The hyperparameter you have to work on is how many decision trees to use for the random forest. With the help of hyperparameter tuning you will be able to work that out.

So this was all about random forest classifier and regressor. I hope you liked this video. Please make sure you subscribe to the channel and share it with all your friends who need this kind of help, because all the materials here are free; share them with as many people as you can. I'll see you all in the next video. Have a great day. Thank you, one and all.
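The row sampling, feature sampling, and aggregation steps described in the captions can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the code from the video: the function names, the sizes (1000 records, 7 columns, 600 rows and 5 features per tree), and the example regression outputs are all assumptions chosen for the demo, and no decision tree is actually trained here.

```python
import random
from collections import Counter
from statistics import mean, median


def sample_rows(total_rows, rows_per_tree, rng):
    """Row sampling with replacement: the same row index may be drawn again."""
    return rng.choices(range(total_rows), k=rows_per_tree)


def sample_features(total_features, features_per_tree, rng):
    """Feature sampling: pick a subset of distinct column indices."""
    return sorted(rng.sample(range(total_features), features_per_tree))


def majority_vote(votes):
    """Classification: return the class predicted by the most trees."""
    return Counter(votes).most_common(1)[0][0]


def aggregate_regression(outputs, use_median=False):
    """Regression: combine the trees' continuous outputs by mean or median."""
    return median(outputs) if use_median else mean(outputs)


rng = random.Random(0)   # fixed seed only so that reruns give the same samples
d, m = 1000, 7           # records and columns in the full dataset (assumed sizes)
d_prime, n = 600, 5      # rows and features per tree, with d' < d and n < m
n_trees = 4              # number of decision trees (the hyperparameter to tune)

for tree_id in range(1, n_trees + 1):
    rows = sample_rows(d, d_prime, rng)
    cols = sample_features(m, n, rng)
    # A real forest would now train decision tree `tree_id` on these rows/columns.
    print(f"tree {tree_id}: {len(set(rows))} distinct rows, features {cols}")

# The four votes from the captions: three trees say 1, one says 0.
print(majority_vote([1, 1, 0, 1]))                # prints 1

# Hypothetical continuous outputs from three trees in a regression problem.
print(aggregate_regression([10.0, 12.0, 14.0]))   # prints 12.0
```

In scikit-learn, which the captions mention, the number of trees is set through the `n_estimators` parameter of `RandomForestClassifier` and `RandomForestRegressor`.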

Info

Channel: Krish Naik

Views: 192,995

Keywords: random forest algorithm, random forest algorithm python, random forest explained, random forest regression, random forest equation, random forest classifier example, random forest, random forest pseudocode

Id: nxFG5xdpDto

Length: 10min 18sec (618 seconds)

Published: Fri Aug 23 2019