Multiple Linear Regression Model

Captions

Welcome to the lecture. Today we are going to start a new topic, multiple linear regression modelling. If you recall, we started with the simple linear regression model, where the outcome depends on only one independent variable. Now we are going to extend it. In practice this situation is more realistic: the outcome usually depends on more than one factor or variable, so here we consider a situation where the outcome depends on more than one independent variable.

In the case of simple linear regression modelling we developed many concepts, and I tried to explain their utility and their interpretation. The same concepts and the same interpretation carry forward to multiple linear regression modelling. So it is my request that before you start with the multiple linear regression model, you are clear about all the concepts of the simple linear regression model.

Here we believe that the outcome, which we had denoted as y, depends on more than one independent variable. Earlier we discussed the simple linear regression model, y = beta0 + beta1 x + epsilon. Now we assume that there are k independent variables, and we denote them by x1, x2, up to xk. So the same model can be extended to the case of more than one independent variable, and it can be written as y = beta0 + beta1 x1 + beta2 x2 + ... + betak xk + epsilon. About the interpretation: earlier we said that beta0 is the intercept term, and this remains the same here.
And we say now that beta1, beta2, ..., betak are the regression coefficients associated with x1, x2, ..., xk respectively. So essentially betaj is the regression coefficient associated with the jth explanatory variable xj, and epsilon is, as before, our random error.

Now in this case the role of random errors becomes quite important when we are dealing with real-life situations. The first step in regression modelling is to identify the independent variables, that is, the variables which are going to affect the outcome y. When we try to do so, sometimes it is possible to obtain observations on those variables and sometimes it is difficult. For example, take a variable like taste or intelligence. It is difficult to obtain numerical values on such variables. Intelligence is usually measured by IQ scores, but that is again a sort of indirect measure of intelligence. Similarly, there are some variables which may not be very important, or which may have a very small effect on the outcome y; based on that, sometimes we consider them and sometimes we don't. On the other hand, if the number of explanatory variables becomes very large, the situation becomes more critical, and in that case we try to retain only the important variables which affect the outcome y.

There will be many factors which are beyond our control, and epsilon denotes the joint effect of all such factors. So epsilon is like a basket in which we put all those things which are beyond our control. This epsilon essentially reflects the difference between the observed values and the fitted model, and this idea goes exactly along the same lines as in the case of the simple linear regression model. The situations in which we can use this multiple linear regression model are many.
First of all, I extend the same example which I considered in the case of the simple linear regression model. There I took the example of the yield of a crop, where y denoted the yield and x the quantity of fertilizer. But do you really think that the yield of a crop depends only on the quantity of fertilizer? It depends on several other factors, so now we have an opportunity to incorporate all the important factors which affect the yield. For example, the first factor, x1, is the quantity of fertilizer; similarly x2 can be the level of irrigation, x3 can be the quantity of seeds, x4 can be rainfall, and you can identify some more important factors which affect the yield of a crop.

Under this setup we now have to develop a multiple linear regression model. You may recall that in the case of the simple linear regression model, the first step was to obtain data, and there we obtained data on y and x. We conducted an experiment, we provided the value of x, then we observed the value of y, and this experiment was repeated n times. The same has to be extended here: we conduct the experiment, provide the values of x1, x2, ..., xk, and then record the outcome y. So earlier we had a set of observations like (xi, yi), but now we are going to have a set of observations like (xi1, xi2, ..., xik, yi), where i goes from 1 to n if we repeat the experiment n times.
Earlier we had assumed that all the observations follow the same model. In the case of the simple linear regression model we had the model y = beta0 + beta1 x + epsilon, and we assumed that all observations (xi, yi) follow the same model, so they satisfy yi = beta0 + beta1 xi + epsiloni. We have to extend the same definition to the multiple linear regression model, so let us first set up our model.

The model setup is as simple as this: conduct the experiment n times, and obtain the values of y and of x1, x2, up to xk. Suppose I conduct the experiment and I give x1 the value x11, x2 the value x12, and xk the value x1k; based on these values we observe the outcome y and denote it as y1. This is our first set of observations. Similarly, for the second set we give x1 the value x21, x2 the value x22, and xk the value x2k, and we obtain the observation y2. We continue like this, and finally we obtain the nth set of observations by giving x1, x2, ..., xk the values xn1, xn2, ..., xnk and observing the value yn.

What does this actually mean? Take the same example of the yield of a crop. Say x1 is the quantity of fertilizer, x2 is the irrigation level, and x3 is the quantity of seeds. Suppose I give 2 kilograms of fertilizer and 10 centimetres of irrigation, I use 1 kilogram of seeds, and based on that we observe a yield of, suppose, 40 kilograms.
So 2 is my x11, 10 is my x12, 1 is my x13, and 40 is my y1. Similarly, I can repeat this experiment: say I use 3 kilograms of fertilizer, 15 centimetres of irrigation, and 2 kilograms of seeds, and based on that we observe, suppose, 50 kilograms of yield. These will be denoted x21, x22, x23, and y2. Similarly we repeat this experiment n times and obtain n sets of observations.

So now we have a model, y = beta0 + beta1 x1 + beta2 x2 + ... + betak xk + epsilon, and we assume that each set of observations satisfies this model. This means that for the first observation I can write y1 = beta0 + beta1 x11 + beta2 x12 + ... + betak x1k + epsilon1. Similarly, for the second observation I can write y2 = beta0 + beta1 x21 + beta2 x22 + ... + betak x2k + epsilon2, and so on; for the nth observation I can write yn = beta0 + beta1 xn1 + beta2 xn2 + ... + betak xnk + epsilonn.

So essentially we have n equations, and these n equations can be expressed in terms of vectors and a matrix. On the left we define the vector (y1, y2, ..., yn). On the right we define a matrix, multiplied by the vector (beta0, beta1, beta2, ..., betak). The first row of this matrix is 1, x11, x12, ..., x1k; the second row is 1, x21, x22, ..., x2k; the third row is 1, x31, x32, ..., x3k; and this continues up to the last row, 1, xn1, xn2, ..., xnk. To this we add the vector (epsilon1, epsilon2, ..., epsilonn). Now I denote the first vector as y, the matrix as X, the coefficient vector as beta, and the last vector as epsilon, so I can write the entire model as y = X beta + epsilon.
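Written out, the n equations described above stack into the following matrix form. This is only a transcription of what the lecture describes verbally, with the intercept handled by a first column of ones:

```latex
\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}
=
\begin{pmatrix}
1 & x_{11} & x_{12} & \cdots & x_{1k} \\
1 & x_{21} & x_{22} & \cdots & x_{2k} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & x_{n1} & x_{n2} & \cdots & x_{nk}
\end{pmatrix}
\begin{pmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{pmatrix}
+
\begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{pmatrix},
\qquad\text{that is,}\qquad
y = X\beta + \varepsilon .
```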
Now observe that the first column here contains only ones; this indicates the intercept term. This can be made a little more general. I can write my model in general as y = X beta + epsilon, where X has the rows (x11, x12, ..., x1k), (x21, x22, ..., x2k), up to (xn1, xn2, ..., xnk). If I want an intercept term in the model, then the first column of the X matrix has to be all ones, and if I don't need an intercept term, the X matrix remains as such.

So this is a very general form, in which y is an n x 1 vector of observations on the study variable (sometimes called the response variable), X is an n x k matrix of n observations on each of the k independent variables x1, x2, ..., xk, beta = (beta1, beta2, ..., betak) is a k x 1 vector of regression coefficients associated with x1, x2, ..., xk, and epsilon = (epsilon1, epsilon2, ..., epsilonn) is, as usual, an n x 1 vector of random errors. For the sake of completeness, I can also write y as (y1, y2, ..., yn) transpose.

Now, if I want an intercept term in the model, what do I have to do? Take the first column of the X matrix to be all ones, and then correspondingly beta1 becomes the intercept term. So from now onwards we will start with the model y = X beta + epsilon and we will not bother whether there is an intercept term or not. If I want the intercept term, I simply set the first column of the X matrix to ones; otherwise I simply continue with the X matrix as the matrix of observations obtained on the explanatory variables.
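As a small illustration of this setup, the sketch below builds the X matrix with and without the column of ones. It assumes NumPy is available; the helper name `design_matrix` is made up for this example, and the two rows are the two crop observations from the lecture (reading the third input of the second run as 2 kilograms of seeds).

```python
import numpy as np

def design_matrix(x, intercept=True):
    """Return the X matrix; prepend a column of ones if an intercept is wanted.

    Hypothetical helper for illustration only.
    """
    x = np.asarray(x, dtype=float)
    if intercept:
        ones = np.ones((x.shape[0], 1))
        return np.hstack([ones, x])
    return x

# Columns: fertilizer (kg), irrigation (cm), seeds (kg)
x_obs = [[2.0, 10.0, 1.0],
         [3.0, 15.0, 2.0]]
y = np.array([40.0, 50.0])  # observed yields (kg)

X_with = design_matrix(x_obs, intercept=True)      # first column is all ones
X_without = design_matrix(x_obs, intercept=False)  # observations only

print(X_with.shape, X_without.shape)
```

With the intercept included, X simply gains one extra column, so the coefficient vector gains one extra entry, which is the convention the lecture adopts from here on.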
You may now recall that in the case of the simple linear regression model we made certain assumptions about the model; we are going to make similar assumptions for the multiple linear regression model. The first assumption was that the expected value of epsiloni is 0. In the multiple linear regression model we have not one epsiloni but a vector of them, so I assume that the expected value of epsilon is the null vector, E(epsilon) = 0. The interpretation of this we have already discussed in the case of the simple linear regression model.

The second assumption is about the variance-covariance matrix. We assume that the variance-covariance matrix of epsilon, which is the same as the expected value of epsilon epsilon' because E(epsilon) = 0, is sigma^2 I_n. The diagonal elements denote the variances of epsilon1, epsilon2, ..., epsilonn, and the off-diagonal elements denote the covariances between epsiloni and epsilonj, which are 0. Again, this is the same assumption that all the epsiloni are identically and independently distributed. So we can see from this matrix that we are assuming that epsilon1, epsilon2, ..., epsilonn all have the same variance sigma^2 and are mutually uncorrelated, which under normality means they are independent of each other.

The third assumption is that the rank of the X matrix is k, and remember k is the number of independent variables, so essentially we assume that X has full column rank. The advantage of making this assumption will be clear to you in the next lecture, when we go for the estimation of parameters.
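The rank and covariance assumptions can be checked numerically on a toy example. The sketch below assumes NumPy; the first two rows of X reuse the crop numbers from the lecture, while the remaining rows and the value of sigma^2 are invented purely for illustration.

```python
import numpy as np

# Hypothetical design matrix: a column of ones plus three regressors
X = np.array([
    [1.0, 2.0, 10.0, 1.0],
    [1.0, 3.0, 15.0, 2.0],
    [1.0, 4.0, 12.0, 1.0],
    [1.0, 5.0, 20.0, 3.0],
    [1.0, 6.0, 18.0, 2.0],
])
n, k = X.shape

# Full column rank: rank(X) equals the number of columns k
has_full_rank = np.linalg.matrix_rank(X) == k

# Break the assumption: make the last column exactly twice the second one
X_bad = X.copy()
X_bad[:, 3] = 2.0 * X_bad[:, 1]
rank_bad = np.linalg.matrix_rank(X_bad)  # one less than k

# Assumed error covariance sigma^2 * I_n: equal variances, zero covariances
sigma2 = 4.0  # made-up value
cov_eps = sigma2 * np.eye(n)

print(has_full_rank, rank_bad, k)
```

When one column is an exact linear combination of the others, as in `X_bad`, the rank drops below k and the full column rank assumption fails.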
The next assumption is that X is a non-stochastic matrix. You may recall that a similar assumption was made in the case of the simple linear regression model, where we assumed that x is a fixed quantity, a non-stochastic variable. Here we are making it more general: we now have not one variable but several, so we extend the same assumption of the simple linear regression model to all the k independent variables.

The last assumption is that epsilon follows a multivariate normal distribution with mean equal to the null vector and covariance matrix sigma^2 I_n. This is again similar to the assumption we made in the case of the simple linear regression model, where we assumed that the epsiloni follow a univariate normal distribution with mean 0 and variance sigma^2. Now we extend it to all of epsilon1, epsilon2, ..., epsilonn jointly. Again I would like to emphasize that the utility of the normal distribution comes into the picture when we consider maximum likelihood estimation of the parameters, or when we go for tests of hypotheses and confidence interval estimation.

Next we come to the interpretation of these regression parameters. We have considered the model y = beta1 x1 + beta2 x2 + ... + betak xk + epsilon, and we have assumed that the expected value of epsilon is the null vector, so I can write the expected value of y as E(y) = beta1 x1 + beta2 x2 + ... + betak xk. Here itself you can see the utility of assuming that x1, x2, ..., xk are non-stochastic. Another advantage of assuming that x1, x2, ..., xk are non-stochastic is that the outcome of the experiment will not be dependent on the values of x1, x2, ..., xk. So if somebody is conducting an experiment in city number one, somebody is collecting the observations in city number two, and somebody else is collecting the observations in city
number three, then whatever analysis we obtain on the basis of the collected set of data is not going to depend on city number one, city number two or city number three; it will be valid for everyone.

Now, based on this, if I find the partial derivative of the expected value of y with respect to a certain variable xj, it comes out to be betaj. So you can see that betaj is nothing but the rate of change in the mean value of y with respect to the jth explanatory variable. It essentially denotes the change in the mean value of y when the jth explanatory variable changes by one unit, with the other explanatory variables held fixed. If you recall, this is the same interpretation as in the case of the simple linear regression model, so whatever interpretation I gave to beta1 in the simple regression model is now extended to beta1, beta2, ..., betak.

If you ask what is the interpretation of an intercept term in the model: to include an intercept term, I simply take all values of x1 to be 1. In this case the model becomes E(y) = beta1 + beta2 x2 + ... + betak xk. So if all the values of x2, x3, ..., xk are 0, then the expected value of y is nothing but beta1. So in this case also the intercept term denotes the mean value of y when all independent variables take the value 0. This is the same interpretation that we gave in the case of the simple linear regression model: there I considered only one variable x to be 0; now I am saying that all of x2, x3, ..., xk take the value 0.

So we have completed the description of the model. In the next lecture we will consider the estimation of the model parameters. Till then, goodbye.
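The interpretation of the coefficients can be verified with a few lines of arithmetic. The sketch below assumes NumPy, and the coefficient values are invented for illustration; `beta[0]` plays the role of the intercept (the lecture's beta1 when the first column of X is all ones), and `mean_response` is a hypothetical helper name.

```python
import numpy as np

# Hypothetical coefficients: intercept first, then three slopes
beta = np.array([5.0, 2.0, 1.5, 3.0])

def mean_response(x, beta):
    """E(y) for one set of explanatory values, with beta[0] as the intercept."""
    return beta[0] + np.dot(beta[1:], x)

x = np.array([2.0, 10.0, 1.0])
x_plus = x.copy()
x_plus[1] += 1.0  # raise only the second explanatory variable by one unit

# Change in the mean of y for a one-unit change, others held fixed
change = mean_response(x_plus, beta) - mean_response(x, beta)

# Mean of y when every explanatory variable is zero
at_zero = mean_response(np.zeros(3), beta)

print(change, at_zero)
```

Here `change` equals the slope attached to the variable that was moved, and `at_zero` equals the intercept, which matches the two interpretations given in the lecture.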

Info

Channel: Linear Regression Analysis and Forecasting

Views: 19,587

Keywords: Multiple regression model, study variable vector, explanatory variable matrix, regression coefficients.

Id: yW-AC4XTchc

Length: 32min 23sec (1943 seconds)

Published: Sun Jan 15 2017