Welcome, everyone, to the class on marketing research and analysis. Today we will continue from our last session. In the last session we were discussing regression: we finished simple regression and then started multiple regression. Multiple regression is the case where there is more than one predictor variable and a single dependent variable. So, when you have more than one independent, or predictor, variable, it is a case of multiple regression. But there are a few special issues that arise during a multiple regression which we need to check, and today we will be covering them. We saw how to conduct a multiple regression, which we solved through some cases by hand and then also in SPSS. One problem that arises is that sometimes we need to understand which predictors are the most important in a study. To do that we have two methods, which we will be covering today: one is called stepwise regression and the other is called hierarchical regression. I will explain systematically what stepwise regression means and what hierarchical regression means. Sometimes people think it is very confusing, and some people think it is very easy; actually it is neither. You need to understand what exactly stepwise means and what the other ways of entering the data are. In fact, stepwise and hierarchical are nothing but ways of entering the data into the regression equation in the case of multiple regression. So let us see.
As the slide says, in multiple regression contexts researchers are very often interested in determining the best predictors in the analysis. Which is the best predictor? Say I have three or four predictors; the predictors are my independent variables. Which one is the best, and how do we know? The researcher may simply be interested in explaining the most variability in the dependent variable with the fewest possible predictors. So, if I can build a model that explains the highest variance with the lowest number of predictors or independent variables, then it is a good model. Why? Because one of the conditions in research is that it should be parsimonious in nature: you should not be using unnecessarily excessive data or excessive variables. The idea is to get the best output with the minimum input.

Two approaches to determine the quality of the predictors, as mentioned earlier, are stepwise regression and hierarchical regression. Let us see.

What is stepwise regression? It is frequently used in psychological research to evaluate the order of importance of variables and to select a useful set of variables. So the questions are: in which order should I enter the variables, and why is the order important? That question might be coming to your mind when I say this. If I enter the variables in any order, how does it matter? It does matter, because when I put in the variables in a haphazard manner, what happens is this: suppose I enter the first variable and then enter the second variable; the moment I enter the second variable there is automatically a change in the first variable as well, in its coefficient. Because of this, I should always have a logical flow for entering the variables. If I use a logic, say the first variable entered is the one with the highest explanatory power, the second has the second-best explanatory power, and the third has the third-best, then I am able to explain things in a more logical way.
Stepwise regression involves developing a sequence of linear models through variable entry, as determined by computer algorithms, and it can be viewed as a variation of forward selection. If you have run regression in SPSS or Minitab, you will have seen two selection methods called forward and backward. The forward selection method is very close to the stepwise regression method, but there is a difference; otherwise there would not have been two names. I will explain that. But first understand the forward selection method: we select as the first predictor the variable with the highest R square, that is, the highest explanatory power, since R square is my explanatory power. The variable with the highest R square is the one I use first. Next I use the variable with the second-highest explanatory power, and I go on until the end. The condition here is that I am not omitting any variable; in forward selection I only keep entering variables. In backward selection, on the other hand, I take all the variables at one time, find the R square, find out which one of them is the weakest, and start eliminating them one by one until I reach a model beyond which there is no improvement. However, true stepwise entry differs from forward entry in that, at each step of a stepwise analysis, the removal of each entered predictor is also considered.

So the difference between forward and stepwise is this: although they look similar and the approach is similar, in forward selection, as I said, you do not think of removing anything. If there are three variables, first, second, third, I include all three in the study and I check how the R square changes when I introduce X1, then X2, and finally X3. In stepwise, the difference is that when I introduce X1 nothing else is there, but when I introduce X2 I check what has now happened to X1: is X1 still contributing, or is it no longer contributing? Why would such a situation occur? You can think of it as nothing but an interaction. When a new variable enters, because of the interaction, or maybe some kind of correlation between these two variables, their own contributions change. Because of that change, it can so happen that the earlier variable, which was the strongest variable, becomes a weak variable in the presence of the new one.
For example, I can quote from Indian mythology. In the Mahabharata war there was a character called Bhishma, who was a great warrior. Bhishma entered the war as the leader, as the head of the defence. Then, at one point, in order to defeat Bhishma, a new variable was introduced: a character who was a transgender. Bhishma had promised that he would not fight anybody who was a lady, and since this character was half lady, he would not fight. So with the presence of this new variable X2, the variable X1, which had been very powerful, which had the highest explanatory power, the defence leader, automatically became a weak character. Because of the presence of X2, X1 became redundant and was removed; Bhishma was killed in the war without even raising his bow and arrow. So sometimes a very powerful variable can also become weak due to the presence of a new variable. That is why stepwise regression also considers removal during the process: the entered predictors are deleted in subsequent steps if they no longer contribute appreciable unique predictive power to the regression when considered in combination with the newly entered predictors. In combination, sometimes the performance may increase, which is a symbiotic relationship, and sometimes the performance may decrease because of the arrival of the new variable.
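To make the mechanics concrete, here is a minimal sketch in Python with statsmodels of a stepwise procedure of this kind. It assumes the data sit in a pandas DataFrame with the dependent variable and the candidate predictors as columns; the p-value thresholds of .05 for entry and .10 for removal are illustrative choices, and this is only a sketch of the idea, not SPSS's exact algorithm.

```python
import statsmodels.api as sm

def stepwise_select(df, dependent, candidates, p_enter=0.05, p_remove=0.10, max_steps=100):
    """Forward entry with backward removal, driven by predictor p-values (a sketch)."""
    selected = []
    for _ in range(max_steps):
        changed = False
        # Forward step: try the not-yet-entered predictor with the smallest p-value.
        best_p, best_var = 1.0, None
        for var in (c for c in candidates if c not in selected):
            model = sm.OLS(df[dependent], sm.add_constant(df[selected + [var]])).fit()
            if model.pvalues[var] < best_p:
                best_p, best_var = model.pvalues[var], var
        if best_var is not None and best_p < p_enter:
            selected.append(best_var)
            changed = True
        # Backward step: re-check every entered predictor and drop the weakest one
        # if it no longer contributes in combination with the newly entered variables.
        if selected:
            model = sm.OLS(df[dependent], sm.add_constant(df[selected])).fit()
            worst = model.pvalues[selected].idxmax()
            if model.pvalues[worst] > p_remove:
                selected.remove(worst)
                changed = True
        if not changed:
            break
    return selected
```

Calling stepwise_select on a data set returns the surviving predictors, which is roughly the story the Variables Entered/Removed table in SPSS tells.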
Now, what are the limitations of the stepwise method? The method is not without limitations, and you should not get into it blindly. Stepwise regression will typically not result in the best set of predictors, and could even result in selecting none of the best predictors; we will see how, and I will give an example. The order of variable entry can be important. How do you choose the order of the variables? You can choose it through the correlations, taking the highest correlation value, or you can simply take the t value. How do you do that? Take Y and X1 and run the regression; suppose there are X1, X2 and X3. I will form three separate equations and run them: Y on X1 is one equation, Y on X2 is another, and Y on X3 is another, and I will look at the t coefficient in all three cases. The one with the highest t value I select as my first entry variable; suppose X2 has the highest t value, then X2 is entered into the stepwise regression first. Second, after doing this I will again run regression equations, now in combinations: X2 with X1, and X2 with X3. So there are two equations, one for each pair, and when I run them I will see which one gives me the larger contribution. Suppose X2 together with X1 gives the larger contribution in terms of the t value; then I select X1 as the next entry variable. So first I entered X2, then I entered X1, and the remaining one is obviously X3. That is how you do it. Each time you do this there are multiple regression equations to run, so it is often very difficult to do by hand; you can use the software.
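If you want to do this first screening step in Python instead of by hand, a minimal sketch looks like the following; the file name and the column names y, x1, x2, x3 are hypothetical placeholders for your own data.

```python
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("your_data.csv")          # hypothetical file name
predictors = ["x1", "x2", "x3"]            # hypothetical predictor columns

# Regress Y on each predictor separately and compare the absolute t values.
t_values = {}
for var in predictors:
    model = sm.OLS(df["y"], sm.add_constant(df[[var]])).fit()
    t_values[var] = abs(model.tvalues[var])

# The predictor with the largest |t| becomes the first entry variable;
# the next round would compare the two-predictor combinations in the same way.
print(sorted(t_values.items(), key=lambda kv: kv[1], reverse=True))
```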
If any of the predictors, that is, the independent variables, are correlated with each other, the relative amount of variance in the criterion variable, the dependent variable, explained by each of the predictors can change drastically when the order of entry is changed. If they are positively correlated, the contribution may improve; if negatively correlated, it may decrease when the order of entry changes. That is what the slide says. Let us take this example: stepwise selection of a team of, say, basketball players or cricket players, whatever you like. It first picks the best potential player, which is what we did, for example with X2. Then, in the context of the characteristics of this player, it picks the second best, which is what we did when choosing between X1 and X3, and then it proceeds to pick the rest of the five players in this manner; say this is a basketball team, so I need five players, or think of any sport you like. So I have selected the five players. Alternatively, there is another method: the five potential players who play together best as a team are selected. One way, through stepwise, is to select one by one the best, then the next best, then the next best, up to five people; or I can select the team of five players by using my logic. Sometimes the best five may be the best on paper, but when it comes to reality on the field, their combined game as a team is not good, whereas there can be another five players who, when they join together, make a very formidable, strong team. But if you go by the first method, they will not be selected. So the alternative is that the five potential players who play together best as a team are selected. The team picked via this method might not have any of the players from the stepwise-picked team, and could perform much better. What I mean to say here is that although the stepwise method is a very good and very powerful method, sometimes it does not yield the desired result and rather gives a very distorted result, which people ignore; they try to use the method blindly, and that is where they get into trouble. So let us see how to do this stepwise regression, and then I will tell you the alternative, which, as we said here, is to pick the five potential players who play together best as a team.
So how do I select them as a team? I will explain that; it will come through the hierarchical regression method. But let us first complete this stepwise regression and see how to do it in SPSS. This is a case I have taken from a book I was reading. It is about how much time a person takes to graduate: the years to graduate is the dependent, the criterion, variable, and it is said to depend on factors like the mother's educational level, say Xm, plus the father's educational level, Xf, plus the parents' income, Xi, plus the faculty interaction, XF, so four variables. These, it says, will affect the dependent variable, the criterion variable, the graduating time, that is, how much time it takes to graduate. So how do we do that? Let me just show you, and then I will get into SPSS. You go to linear regression, take the dependent variable into the dependent box and the independent variables into the independent box, and here, under Method, you choose Stepwise.
So let me go to the data set. This is the data set I was talking about. Years to graduate has been given; the education of the mother is given to you, the father's education is given, the family income is given, and the faculty interaction level is given. What I can do first is simply understand the correlations among these variables: I just go to Analyze and run a correlation. Let me first change the variable view; this one is also a scale variable, it was not set correctly. So, Analyze, Correlate, Bivariate, take all the variables, one, two, three, four, five, run the correlation, and let us just see how this looks.

If we look at the correlations, you can see that years to graduate, which is the dependent variable, is strongly correlated with the mother's education. It may be negative, but the minus sign only means a negative relationship, so do not worry. What does negative mean? It means that with a unit increase in the mother's education there is a decrease in the years to graduate. It should be inversely related; logically also, if the education of the mother is higher, she can contribute more to the child and the years to graduate should come down, so that gives the negative relationship. Similarly for faculty interaction: if it is more, the time taken should be less, so again a negative relationship. This value of about minus point eight looks to be the highest correlation, and the second highest looks to be the mother's education. Years to graduate and father's education are very, very poorly correlated, and family income is also correlated. Among the other pairs you can see, for example, the mother's education with the father's, but that is not our interest here.
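The same correlation check can be done outside SPSS in one line of pandas; the file and column names below are hypothetical stand-ins for the five variables in this data set.

```python
import pandas as pd

# Hypothetical columns: years_to_graduate, mother_edu, father_edu,
# family_income, faculty_interaction
df = pd.read_csv("graduation.csv")

# Pearson correlation matrix of all numeric variables, rounded for readability.
print(df.corr(numeric_only=True).round(3))
```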
So let us go back to the data set. Now, how do I run my stepwise regression? I go to Analyze, Regression, Linear. This is my dependent variable, and I take all my independent variables here; I have taken all of them, and under Method I will ask the computer to do the stepwise selection for me. Then, under Statistics, there are a few things to look at: you can check the descriptives and the partial correlations, and if you have doubts you can also check the collinearity diagnostics; what will that give you? It will tell you about the multicollinearity problem. Let us also ask for the R square change. Then Continue; I do not want to save anything, so I just run it.
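As an aside, if you want to reproduce something similar outside SPSS, scikit-learn's SequentialFeatureSelector does forward (or backward) selection; note that it only adds or only removes predictors, so it does not re-check entered variables for removal the way true stepwise does. The file and column names here are hypothetical.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.feature_selection import SequentialFeatureSelector

df = pd.read_csv("graduation.csv")                      # hypothetical file
X = df[["mother_edu", "father_edu", "family_income", "faculty_interaction"]]
y = df["years_to_graduate"]

# Forward selection of two predictors, scored by cross-validated R square.
sfs = SequentialFeatureSelector(LinearRegression(), n_features_to_select=2,
                                direction="forward", scoring="r2", cv=5)
sfs.fit(X, y)
print("Selected predictors:", list(X.columns[sfs.get_support()]))
```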
So, this first output is the same correlation table that I had already shown you. Now let us go down; these are the descriptive statistics. Look at years to graduate: the mean graduating time is about 5 years, with the standard deviation shown there; the education of the mother is around 5.45 with a standard deviation of about 1.5. These are some of the descriptive statistics.

Now, as I said, what was the difference between a forward selection method and a stepwise regression method? In the forward selection method you do not omit or remove any variable; all are entered. But in stepwise you may remove a variable after you take the combined effect and see which one is no longer contributing. So let us see. My dependent variable is time to graduate, and all my other variables are candidates. I did it stepwise; if I had used Enter, I would have entered all of them into one model directly, but I am interested in doing it stepwise, so this is what I have done.

Now let us go down to the Variables Entered/Removed table. Earlier nothing was showing, but here you see what it says: first, faculty interaction was entered. You could have anticipated this from the correlations as well, since the strongest relationship was between years to graduate and faculty interaction. Second came mother's education, third family income, and fourth father's education; that is the order in which the variables were entered. So the models were built like this: in the first model faculty interaction was taken; in the second, faculty interaction and mother's education; in the third, faculty interaction, mother's education and family income; in the fourth, all four; but in the fifth model faculty interaction was removed. Why? Because in the presence of the other variables, its contribution in the combined effect has come down. But remember, this was the first model, this was the first variable entered, which means it was the most powerful one, and still it has been removed.
Now let us look at the R square values. For the first model the R square is .696, and there the R square change is the same as the R square. Look at the change from the first model to the second: from .696 it improves to .788, then to .875 in the third model, .918 in the fourth, and .916 in the fifth, so between the fourth and fifth models the difference is hardly anything and it is non-significant. This is the problem with the stepwise approach: although it has given you some information, the variable that was the highest, the best, predictor has itself been removed from the final model. That is the problem with stepwise regression.
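The R square change that SPSS reports is just an F test on the difference between two nested models. Here is a small sketch of that calculation, assuming you read the two R square values, the sample size and the number of predictors off your own output; the n used in the example call below is only a placeholder.

```python
from scipy import stats

def f_change(r2_reduced, r2_full, n, k_added, k_full):
    """F test for the change in R square between two nested regression models.
    n: sample size, k_added: predictors added, k_full: predictors in the full model."""
    f_stat = ((r2_full - r2_reduced) / k_added) / ((1.0 - r2_full) / (n - k_full - 1))
    p_value = stats.f.sf(f_stat, k_added, n - k_full - 1)
    return f_stat, p_value

# Model 5 (three predictors, R2 = .916) versus model 4 (four predictors, R2 = .918).
# n = 40 is a placeholder; substitute the actual sample size from the data set.
print(f_change(r2_reduced=0.916, r2_full=0.918, n=40, k_added=1, k_full=4))
```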
Now, this is what I wanted to show you. Look at this model: these are the coefficients and these are the t values. For faculty interaction the t value is about minus 13, and you can watch it change as the models grow. Look at the significance values: when you come to the fourth model, you can see that faculty interaction has become non-significant, which means it has no effect there, and so in the fifth model it has been removed. But when you remove it, what has happened? Whether it has done any good we cannot really see; what I do know is that the variable that was contributing the most is no longer a part of my final model, so how can this be considered a good model? That is the problem that occurs in stepwise regression. Now what I will do is go back to the slides. So this is how you do it, and this is the same table from that output, which I have copied and pasted here.
For the interpretation: inspection of the correlations between the variables in the correlation table shows that mother's education, parents' income and faculty interaction are all highly correlated with the criterion, but father's education is only slightly correlated, something like .04. Also, most of the predictor variables are correlated with each other, with one correlation coefficient as high as .747; you can see it there, .747, which is faculty interaction with mother's education. But if you look at years to graduate, this one is high, father's education is very poor, this one is also high, and this one is the highest. Yet ultimately it has been removed from your final model, and this is how the final model looks.

And interestingly, this is what I was saying: in the last of the models, one to five, a to e, this faculty interaction variable is no longer present; it has been removed. But it was my highest, my best, predictor earlier. So this is what the slide says under variables removed: the predictor variable with the highest correlation with the criterion, the dependent variable, namely faculty interaction, is the first variable entered; however, the final model does not include it, because it has become non-significant. Thus stepwise regression results in a bad model, or let us say an awkward model rather than a bad one: a model that does not include the predictor variable that has the highest correlation. So sometimes it can create trouble. But this problem can be eliminated by using the hierarchical multiple regression method. Now, what is the difference between the two? I will just describe it briefly, because I am running short of time, and I will run it in the next class.
Similar to stepwise regression, hierarchical regression is also a sequential process involving the entry of predictor variables in steps, but the order of variable entry is based on theory. This is the difference: earlier you were deciding on the basis of the variance explained, or the correlation, or the t value, but here the computer is not deciding anything; here you are thinking about which variables should logically be taken together. The researcher chooses the order in which to enter the variables based on theory and past research. That is the basic difference between stepwise and hierarchical regression. Hierarchical regression is an appropriate tool for analysis when variance in a criterion variable is being explained by predictor variables that are correlated with each other. Suppose there are variables that are correlated with each other, for example x1, x2, x3, where x2 and x3 are correlated; then you have to take that into account while doing the analysis. It is a very popular method used to analyse the effect of a predictor variable after controlling for the other variables. How do we control? I will show you: this control is achieved by calculating the change in the adjusted R square at each step of the analysis. At each step we measure the R square and see whether there is an improvement or not. So the researcher defines the order of entry for the variables based on theory. Sets of independent variables are entered in blocks; a block can be just one variable, or it could be a combination of two or three variables that theory says go together, and the R square change method is used. You may sometimes enter nuisance variables first to control for them, and then test the purer effect of the next block of important variables. It is done manually; you use your logic, so theory is always the most important thing here.

It is a way to show whether the variables of your interest explain a statistically significant amount of the variance in your dependent variable after accounting for all the other variables. So it takes into account how much of the variance has been explained. This is the framework of model comparison, a very important term you should understand: it not only lets you test statistically but also tells you which model is the best. In this framework you build several regression models by adding variables to the previous model at each step, just like stepwise up to that point; the only difference is that here we are using logic, theory. In many cases, our interest is to determine whether the newly added variables show a significant improvement in R square.
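Here is a minimal sketch of that blockwise idea in Python, assuming hypothetical column names for this data set: block 1 holds a control variable, block 2 adds the predictors of theoretical interest, and the change in (adjusted) R square between the two fits is what you would report.

```python
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

df = pd.read_csv("graduation.csv")                       # hypothetical file

# Block 1: control (nuisance) variable only; Block 2: adds the variables of interest.
block1 = smf.ols("years_to_graduate ~ family_income", data=df).fit()
block2 = smf.ols("years_to_graduate ~ family_income + mother_edu + faculty_interaction",
                 data=df).fit()

print("Adjusted R2, block 1:", round(block1.rsquared_adj, 3))
print("Adjusted R2, block 2:", round(block2.rsquared_adj, 3))
print(anova_lm(block1, block2))     # F test for the R square change added by block 2
```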
So, before I close, let me say something about this improvement in R square. You must have heard the terms R square and adjusted R square, so let me try to explain the difference between the two. What is R square, and how do I measure it? R square equals 1 minus the sum of squares of the error divided by the total sum of squares, that is, R² = 1 − SS_error / SS_total. Now, in a regression equation, say Y = a + b1 X1 + b2 X2 and so on, as I go on increasing the number of independent variables up to Xn, some correlation or other will always be there, so as you increase the number of variables the R square will keep on increasing, because in social science it is next to impossible that there is absolutely no relationship. And that is a very dangerous thing: even if a variable contributes only .0001, it is still taken in, and every added variable pushes the R square up. So R square can sometimes be a very misleading measure. Instead of relying on it, we use another measure called the adjusted R square, which you must have seen in every results section. What is the formula for the adjusted R square? Adjusted R² = 1 − (1 − R²) × (n − 1) / (n − p − 1). Now, what is this p? p is the number of predictors, the independent variables. As I said, if I increase the number of variables, say from five earlier to six now, my R square will automatically tend to increase, so the (1 − R²) part becomes smaller. But on the other side, since p sits in the denominator term n − p − 1, increasing p makes that denominator smaller and the whole penalty factor (n − 1) / (n − p − 1) larger, which pulls the value down. So the adjusted R square gets, in a sense, normalised: the one-sided increase in R square is balanced against the penalty for adding predictors. In the process, if the new variable is not contributing much, the penalty becomes more dominating, and when it dominates, the adjusted R square does not increase; rather, it may start falling if the new variable is not contributing significantly to the study. Therefore it is always advised to use the adjusted R square instead of the R square in any research study.
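A tiny worked example of the adjusted R square formula, just to show the penalty at work; the R square value, sample size and predictor counts here are made up for illustration.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R square: 1 - (1 - R2) * (n - 1) / (n - p - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# The same R square of .90 with n = 50 observations looks less impressive
# once you pay for more predictors: the adjustment pulls the value down.
print(round(adjusted_r2(0.90, n=50, p=3), 3))    # 0.893
print(round(adjusted_r2(0.90, n=50, p=10), 3))   # 0.874
```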
I will do is, I will carry on this session in the next lecture where I will explain hierarchical
regression little bit more and I will show you how to conducted and the SPSS, ok thank
you very much.