50. Discriminant Analysis in SPSS

Captions
Welcome, everyone, to the class on marketing research and analysis. In the last lecture we discussed a new concept under the ambit of regression, called logistic regression. We learned that logistic regression is a technique used when the dependent variable is categorical in nature, although the general assumption of regression is that both the dependent and the predictor variables should be continuous. It is a special case: sometimes in life we need to decide whether to admit somebody to a course or not, whether it will flood or not, whether it will rain or not, whether somebody will be a defaulter or not, even whether one should marry or not. In such situations the decision should be based on statistical measures: using the independent variables, we try to predict the outcome, whether it will happen or it will not happen. That was the case of logistic regression, where we calculated the odds ratio and saw how the variables were impacting the outcome variable. We also saw, in the last illustration, how the classification results measure the strength of the model, that is, how correctly it was classifying the cases in the study. Today we will talk about another method which is very close to logistic regression, or serves a similar purpose, called discriminant analysis.
As the name suggests, to discriminate is to separate one group from the other, the good from the rest, and so on. Discriminant analysis is a technique similar to logistic regression, but the difference between the two is this: logistic regression performs better when your data do not follow the normality assumptions. If your data do follow the normality assumption, that is, all the independent variables are normally distributed, then you have two options, logistic regression and discriminant analysis, and the question is which one to use. Please remember: if your data follow a normal distribution, it is generally better to use discriminant analysis instead of logistic regression; if they do not, it is better to use logistic regression instead of discriminant analysis. With this basic understanding, let us start the lecture. Discriminant analysis is a technique for analysing data when the criterion (dependent) variable is categorical and the predictor (independent) variables are metric, that is, measured on at least an interval scale. For example, suppose a school wants to select or reject students for an interschool hockey tournament, and the school authority uses parameters such as age, height, weight and stamina for selecting those students. Selecting the students in this way is a clear case for discriminant analysis.
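A minimal sketch of this selection example in Python, using scikit-learn's LinearDiscriminantAnalysis; the data values below are invented purely for illustration, not taken from the lecture.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Hypothetical predictors for 8 students: age, height (cm), weight (kg), stamina score
X = np.array([
    [14, 160, 50, 7], [15, 168, 58, 8], [16, 172, 62, 9], [15, 170, 60, 8],
    [14, 150, 45, 4], [15, 155, 48, 5], [16, 158, 52, 4], [15, 152, 47, 5],
])
# Categorical criterion with two groups: 1 = select, 0 = reject
y = np.array([1, 1, 1, 1, 0, 0, 0, 0])

# Fit the discriminant function and classify the cases
lda = LinearDiscriminantAnalysis()
lda.fit(X, y)
predicted = lda.predict(X)
```

With clearly separated groups like these, the fitted function classifies the training cases almost perfectly; `lda.coef_` holds the estimated discriminant weights.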
This is what "discriminant" means: on the slide, the girls have been separated from the boys through some measures, and this line is the mark of difference, the discriminant; it separates the two groups. Now, the classification of discriminant analysis: discriminant analysis techniques are described by the number of categories possessed by the criterion variable; they are either two-group or multiple-group. When the criterion variable has only two categories, the technique is known as two-group discriminant analysis. For example: do users and non-users of social networking sites differ with respect to their age? This is like a binomial logistic regression, with the condition, as I have already said, that here the independent variables need to follow a normal distribution; if they do not, the same question, whether users and non-users of social networking sites differ with respect to their age, can be answered through logistic regression. Multiple-group discriminant analysis is the counterpart of multinomial logistic regression: when the criterion variable has three or more categories, the technique is referred to as multiple discriminant analysis. Example: do heavy, medium and light users of hard drinks differ in terms of their weight, or perhaps their heart condition? There are several such questions in real life: does being vegetarian or non-vegetarian affect a person's stamina or weight?
Asking whether the weight of a person is affected by the type of food is actually the reverse case: there the independent variable is categorical and the dependent variable, weight, is continuous. Here assume the inverse, where the dependent variable is categorical and the independent variables are continuous: do heavy, medium and light users of hard drinks differ in terms of their weight? If we can find such a relationship, that is where we use discriminant analysis; the users/non-users question was a two-group discriminant analysis, and this one is a three-group (multiple-group) discriminant analysis. So let us see what exactly we do: the development of discriminant functions. The objective of discriminant analysis is to develop discriminant functions, linear combinations of the predictor (independent) variables, which best discriminate between the categories of the criterion (dependent) variable. In other words, a linear combination of the independent variables is formed in such a way that it best discriminates between the groups of the criterion variable. Say the criterion variable has two groups, heavy and non-heavy. If I take independent variables such as age, sleeping habit and so on, how do they help to differentiate between heavy and non-heavy? Can I create an equation? A discriminant function is like an equation which, by plugging in the values of X1 and X2, helps me predict whether a person will be heavy or non-heavy.
The other objectives are: examination of whether significant differences exist among the groups in terms of the predictor variables, and determination of which predictor variables contribute most to the inter-group differences. For comparison, in logistic regression you saw exponential beta: the higher the exponential beta, the higher the contribution of that particular variable to the dependent variable. If you remember the last case, where we had taken monthly salary and gender, monthly salary had the higher exponential beta, so it explained more of the effect on the dependent variable than gender did. Further objectives are the classification of cases to one of the groups based on the values of the predictor variables, and the evaluation of the accuracy of classification, just as in logistic regression. The discriminant function, then, is the linear combination of independent variables developed by discriminant analysis that best discriminates between the categories, whether two, three or four; you should not have too many categories, as that creates confusion. In the two-group case it is possible to derive only one discriminant function, but in multiple discriminant analysis more than one function may be computed. Very interestingly, as in the example I gave, it looks similar to ANOVA, and there is in fact a relationship with MANOVA, which has multiple metric dependent variables and categorical independent variables, each with several levels. Discriminant analysis is actually the reverse; it is called the mirror image of MANOVA. Why?
Because in MANOVA the dependent variables are continuous and the independent variables are categorical, whereas here, interestingly, you have one categorical dependent variable and independent variables that are continuous; that is why it is a mirror image. In MANOVA the criterion is metric and the predictors are categorical; in discriminant analysis the criterion is categorical and the predictors are metric. The assumptions are the same as for MANOVA: multivariate normality, independence of observations (cases), and homogeneity of the group variance-covariance matrices. Discriminant analysis permits testing the MANOVA-type hypothesis that two or more groups differ significantly on a linear combination of the discriminating variables, that is, the independent variables you have taken. Another way to understand it: how well can the levels of the grouping variable be discriminated by scores on the discriminating variables? In other words, how well do my predictor variables explain the differences in the criterion (dependent) variable? Now look at this interesting table of similarities between ANOVA, regression and discriminant analysis. Number of dependent variables: one in ANOVA, one in regression, one in discriminant analysis. Number of independent variables: multiple in all three. The differences concern the nature of the variables: the dependent variable is metric (continuous) in ANOVA and in regression, but categorical in discriminant analysis.
The nature of the independent variables: categorical in ANOVA, metric in regression, and metric in discriminant analysis. That is basically the relationship, and the difference, between them. Now, what does the model look like? The discriminant analysis model is the statistical model on which discriminant analysis is based, and the discriminant function has the form

D = b0 + b1 X1 + b2 X2 + ... + bk Xk

where D is the discriminant score, b0 is the intercept, b1, b2, ..., bk are the coefficients (weights), and X1, ..., Xk are the predictor (independent) variables. As in regression, the coefficients govern the impact of each variable on the value of D: the higher the value of b, the larger the impact on D. The coefficients are estimated so that the groups differ as much as possible on the values of the discriminant function; this occurs when the ratio of the between-group sum of squares to the within-group sum of squares of the discriminant scores is at a maximum. What does that mean? If you remember, in ANOVA the F ratio was the between-group mean square divided by the within-group mean square; similarly, here the ratio of the between-group to the within-group sum of squares of the discriminant scores should be maximal. Any other linear combination of the predictors will result in a smaller ratio, so only the combination that maximises this ratio is the best one. Now, some key statistics associated with discriminant analysis. First, the canonical correlation, which I will also show you in SPSS: it measures the extent of association between the discriminant scores and the groups. Centroid: the mean of the discriminant scores for a particular group.
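The "maximise the between-group to within-group sum of squares" criterion can be checked numerically. A sketch, with made-up data and two candidate weight vectors: one that separates the groups and one that does not.

```python
import numpy as np

def ss_ratio(scores, groups):
    """Between-group SS divided by within-group SS of the discriminant scores."""
    grand_mean = scores.mean()
    between = within = 0.0
    for g in np.unique(groups):
        s = scores[groups == g]
        between += len(s) * (s.mean() - grand_mean) ** 2
        within += ((s - s.mean()) ** 2).sum()
    return between / within

# Hypothetical two predictors and a two-group criterion
X = np.array([[1.0, 2.0], [2.0, 1.5], [1.5, 1.8],
              [4.0, 5.0], [5.0, 4.5], [4.5, 4.8]])
groups = np.array([0, 0, 0, 1, 1, 1])

# Scores D = b1*X1 + b2*X2 for two candidate weight vectors
good = X @ np.array([1.0, 1.0])   # preserves the group separation
poor = X @ np.array([1.0, -1.0])  # largely cancels the separation
```

The estimated discriminant function is, in this sense, the weight vector whose scores give the largest such ratio; any other linear combination scores lower.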
Classification matrix: it contains the numbers of correctly classified and misclassified cases. Hit ratio: in the classification matrix, which I will show you, the sum of the diagonal elements divided by the total number of cases gives the hit ratio, the percentage of cases correctly classified; I will point it out in the table, but understand that these terms are very important. Unstandardised discriminant function coefficients: the multipliers of the variables when the variables are in their original units of measurement. Discriminant score: the unstandardised coefficients are multiplied by the values of the variables (X1, X2 and so on), these products are summed, and the constant term b0, the intercept, is added to obtain the discriminant score; it is exactly like evaluating a regression equation. Eigenvalue: for each discriminant function, the eigenvalue is the ratio of the between-group to the within-group sum of squares; the higher the eigenvalue, the greater the explanatory power of the function. Standardised discriminant function coefficients: the discriminant function coefficients used as multipliers when the variables have been standardised to a mean of 0 and a variance of 1. Why standardise? Because only when you have standardised the variables can you compare two different variables on the same footing: after standardisation each variable has mean 0 and standard deviation 1 (and SD = 1 means the variance is 1), which makes the coefficients comparable across variables.
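The hit-ratio arithmetic can be sketched in a few lines; the 2 x 2 counts below are assumed for illustration.

```python
import numpy as np

# Hypothetical 2x2 classification matrix: rows = observed group, columns = predicted group
classification_matrix = np.array([[55, 7],
                                  [2, 26]])

# Hit ratio = sum of the diagonal (correctly classified cases) / total number of cases
hit_ratio = np.trace(classification_matrix) / classification_matrix.sum()
```

Here 81 of 90 cases sit on the diagonal, so the hit ratio is 0.90, i.e. 90% correctly classified.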
Some other terms: the structure correlations, also referred to as discriminant loadings (a term you will meet again later), represent the simple correlations between each predictor and the discriminant function. Another term is Wilks' lambda (Λ), sometimes called the U statistic. Wilks' lambda for each predictor is the ratio of the within-group sum of squares to the total sum of squares (within, note, not between), and it helps to measure the statistical significance of the model. Mahalanobis procedure: a stepwise procedure used in discriminant analysis to maximise the distance between the two closest groups; you were also using the Mahalanobis distance in regression to find outliers, if you remember. Territorial map: a tool for assessing discriminant analysis results that plots the group membership of each case on a graph; I generally do not use it, but you can plot the cases and inspect them if you wish. So what are the steps involved? First formulate the problem, then estimate the discriminant function coefficients, determine the significance, interpret the results, and finally assess the validity. Step 1: formulate the problem by identifying the objectives, the criterion variable and the independent variables; you need to be clear about what your dependent variable is, what your independent variables are, and what the objective of the study is.
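That within-to-total ratio for a single predictor is easy to compute directly; a sketch with made-up values for one predictor observed in two groups.

```python
import numpy as np

def wilks_lambda(x, groups):
    """Univariate Wilks' lambda: within-group SS / total SS.
    Values near 0 mean the groups are well separated on this predictor;
    values near 1 mean the predictor hardly discriminates at all."""
    total_ss = ((x - x.mean()) ** 2).sum()
    within_ss = sum(((x[groups == g] - x[groups == g].mean()) ** 2).sum()
                    for g in np.unique(groups))
    return within_ss / total_ss

# Hypothetical predictor values: the two groups have clearly different means
x = np.array([10.0, 11.0, 12.0, 20.0, 21.0, 22.0])
groups = np.array([0, 0, 0, 1, 1, 1])
lam = wilks_lambda(x, groups)
```

Because almost all the variation here is between the groups rather than within them, the lambda comes out close to 0.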
The criterion variable must consist of two or more mutually exclusive and collectively exhaustive categories: the categories are clearly distinct, and every respondent falls into one of them. The sample may be divided into two parts: one part, called the analysis sample, is used for estimating the discriminant function; the other, called the validation sample, is used for validating the discriminant function. So if you have, say, 200 cases, you can divide them into two parts of 100 each, one to estimate the discriminant function and the other to validate it. The roles of the two parts can then be interchanged and the analysis repeated; this is called cross-validation, which you will see later. Step 2: how do I estimate the discriminant function coefficients? There are two broad approaches, the direct method and the stepwise method; by now I think you understand what a stepwise method is, as I explained it for regression too. The direct method estimates the discriminant function with all the predictors entered simultaneously (in regression you were using the method called Enter); it is appropriate when the model is based on previous research or a theoretical model. In the stepwise method, the predictor variables are entered sequentially based on their ability to discriminate among the groups, similar to the forward or backward stepwise methods you used in regression. Then you determine the significance of the discriminant function.
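The analysis/validation split, with the roles then interchanged, can be sketched as follows; the data are simulated and all variable names are my own.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Hypothetical data: 200 cases, 3 metric predictors, 2 groups with shifted means
X = np.vstack([rng.normal(0, 1, (100, 3)), rng.normal(2, 1, (100, 3))])
y = np.array([0] * 100 + [1] * 100)

# Split into an analysis sample and a validation (holdout) sample of 100 each
X_a, X_v, y_a, y_v = train_test_split(X, y, test_size=0.5,
                                      random_state=0, stratify=y)

# Estimate on the analysis half, validate on the holdout half ...
acc1 = LinearDiscriminantAnalysis().fit(X_a, y_a).score(X_v, y_v)
# ... then interchange the roles and repeat (cross-validation)
acc2 = LinearDiscriminantAnalysis().fit(X_v, y_v).score(X_a, y_a)
```

If the model is stable, the two holdout accuracies should be similar and both well above chance.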
It would not be meaningful to interpret the discriminant functions if they were not statistically significant. The null hypothesis, that in the population the means of all discriminant functions in all groups are equal, can be statistically tested; this test is based on Wilks' lambda, which, as I already said, is the within-group variance divided by the total variance. If several functions are tested simultaneously, as in multiple discriminant analysis, Wilks' Λ is the product of the univariate lambdas for each function. You need not worry about this: you will not do it by hand; SPSS, or any other software, computes it for us. The significance level is estimated based on a chi-square transformation of the statistic, and the chi-square value provided helps in judging the overall model significance. If the null hypothesis is rejected, indicating significant discrimination, one can proceed to interpret the results. The fourth step is to interpret the results. The value of the coefficient for a particular predictor variable (the b1, b2 we were talking about) depends on the other predictors included in the discriminant function; they are interrelated. The signs of the coefficients are arbitrary, but they indicate which variable values result in large and small function values and associate them with particular groups. In simple terms, the slope values, together with the variables, do the discriminating: they give you the discriminant score.
This discriminant score is what finally helps you decide whether a case falls into category 1 or category 2. The relative importance of a variable can be examined through the absolute magnitude of its standardised discriminant function coefficient: predictors with higher coefficients contribute more to the discriminating power. A higher coefficient is better at showing that the two groups are clearly different; if the coefficient is small, you cannot really say there is a clear-cut difference between the two groups on that variable. Some idea of importance can also be obtained by examining the structure correlations, also called canonical loadings or discriminant loadings: remember, the greater the correlation, the more important the corresponding predictor. Finally, we come to the validity of the discriminant analysis. To check whether the study is valid, divide the data into two sub-samples, one for analysis and the other for validation; the first is used for estimation and the second for validation. The analysis sample is used for estimating the discriminant function, and the validation sample is used for developing the classification matrix. In logistic regression you also saw a classification matrix; it helped you calculate the hit ratio, which told you what percentage of the cases were predicted correctly and thus how strong the model was. In the last case, if I remember, the model classified 91.2 percent of the cases correctly.
Here, likewise, the classification matrix helps you find out whether the model sufficiently explains the whole process. The discriminant weights, estimated from the analysis sample, are multiplied by the values of the predictors in the holdout (validation) sample to generate the discriminant scores for the holdout cases. So one sample is used for estimation and the other for validation; if you do not do this, it is not a great loss, you can still learn something, but validating makes the study more rigorous. What is the hit ratio? It is determined by summing the diagonal elements of the classification matrix and dividing by the total number of cases: in a 2 x 2 table with cells a, b, c, d (observed yes/no against predicted yes/no), the hit ratio is (a + d) divided by the total number of cases. It is helpful to compare the percentage of cases correctly classified by discriminant analysis with the percentage that would be obtained by chance: the classification accuracy achieved by discriminant analysis should be at least 25% greater than that obtained by chance. Now, how do you conduct the discriminant analysis in SPSS? Let us go to the example straight away. In this data file, if you go to the Variable View, you will see the status of people, recorded as successful or unsuccessful, along with the level of IQ they have, the level of guidance given to them, and their income, say the family income.
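The 25%-better-than-chance rule of thumb is often checked against the proportional chance criterion; a sketch, where the group sizes and the 90% hit ratio are assumed for illustration.

```python
# Rule-of-thumb check: the hit ratio should beat chance by at least 25%.
# Hypothetical group sizes (62 and 28 cases) and a 90% hit ratio are assumed.
n1, n2 = 62, 28
n = n1 + n2
hit_ratio = 0.90

# Proportional chance criterion: the accuracy expected if cases were
# assigned to groups at random in proportion to the group sizes
p1, p2 = n1 / n, n2 / n
chance = p1 ** 2 + p2 ** 2

passes = hit_ratio >= 1.25 * chance
```

Here chance accuracy is about 57%, and 1.25 times that is about 71%, so a 90% hit ratio comfortably clears the threshold.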
Now, we want to see whether IQ, guidance and income predict the status of a person, successful or unsuccessful. So there are three independent variables and one dependent variable, which has only two codes: 0 for successful and 1 for unsuccessful. I want to run a discriminant analysis. First you can check the data for normality; if it follows normality, you then go for discriminant analysis. How do I proceed? Go to Analyze, then Classify (because it is a classification technique), then Discriminant. The grouping variable is status, with range minimum 0 and maximum 1; click Continue. As independents I take all three: IQ, guidance and income. Now open Statistics. What do I need? I need the means, and everything else is also listed in the presentation: move the independent variables in, select them, click Statistics, tick Box's M. So I tick Box's M, the unstandardised function coefficients and the within-groups correlation matrix. If you are interested, under Save you can also save the predicted group membership, that is, which group each case is assigned to, successful or unsuccessful, along with the probabilities. That is all we need, so I click OK. Looking at the output: N is 90 and there are no missing cases. The group statistics table shows, for successful and unsuccessful people, the mean and standard deviation of IQ, guidance and income, and as you can see, the mean IQ of successful people is higher than that of the unsuccessful.
Guidance is again higher, and income is also higher. So perhaps all of them act positively, but by how much, and which one is the most important? If you look at the within-groups correlations, the correlation between IQ and guidance is .338, between IQ and income .005, and between guidance and income .053, so the strongest correlation we find is between IQ and guidance. Now let us go down to Box's M test. This tests the null hypothesis of equal population covariance matrices, and it should be non-significant: the significance value should be above .05. The null hypothesis says there is no difference between the group covariance matrices in the population; here it cannot be rejected, and that is exactly what is required, so this value should always come out above .05. Next, look at the canonical correlation value. It is very much like the R of regression, so its square is like the R-square: the canonical correlation here is .727, and .727 squared is around .53. That is the share of the variance explained by the model, with one criterion and three predictor variables. The Wilks' lambda test of significance is significant, which tells us that the overall model is significant. Now let us go back to the problem: a researcher wants to see the influence of IQ, guidance and income on the result status, success or failure, of students; the IVs (IQ, guidance, income) and the DV (success or failure) are given. Now, how do I report this?
Discriminant analysis was used to conduct the study. First this value: the canonical correlation is .727; if I square it, I get .5285, so the discriminant function extracted accounted for nearly 53% of the variance in the students' result status, confirming the hypothesis. Next, Wilks' lambda: it is significant. The overall chi-square test was significant, with Wilks' lambda = .471, chi-square = 65.125, 3 degrees of freedom and significance .001, so the overall model was significant; these values you have to mention. The third table is the Box's M test; whether or not it is mentioned in the write-up, you need to report it: from Box's M test we found that the null hypothesis, that no difference among the group covariance matrices exists, could not be rejected, and that too you can report. Now, coming to the standardised canonical discriminant function coefficients: this table tells you the importance of each predictor variable to the criterion variable. Income has .811, so income affects the success or failure status most strongly; next is IQ, at .442, followed by guidance. So the order of importance is income, then IQ, then guidance. Now, what is the next table saying? You will find it if you go back to the output file.
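The reported figures can be cross-checked with Bartlett's chi-square approximation for Wilks' lambda, chi-square = -(n - 1 - (p + g)/2) ln Λ; the formula itself is standard, not from the lecture, but the inputs below are the values from this example.

```python
import math

n, p, g = 90, 3, 2   # cases, predictors, groups, as in the SPSS example
wilks = 0.471        # Wilks' lambda reported in the output

# Bartlett's chi-square approximation for testing the discriminant function
chi_square = -(n - 1 - (p + g) / 2) * math.log(wilks)
df = p * (g - 1)

# Squaring the canonical correlation of .727 gives the variance explained
r_squared = 0.727 ** 2
```

The approximation reproduces the reported chi-square of 65.125 on 3 degrees of freedom, and the squared canonical correlation comes out at about .5285, the "nearly 53% of variance" figure.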
You have already seen the previous tables; now we are talking about this table, the unstandardised canonical discriminant function coefficients, which is the one I need here, because my equation is built on it. So D = -17.413 + .056 IQ + .037 guidance + .015 income; this is how you calculate the discriminant score. By plugging in the actual values of IQ, guidance and income you can compute the discriminant score for any case and classify it accordingly; if you work with the saved group-membership probabilities, the cutoff is .5, so above .5 the case moves towards 1 and below .5 towards 0, because there can be only two values, 0 and 1. Now look at the classification results, the table which tells you the hit ratio. The status was observed as successful or unsuccessful, and this is crossed with the predicted status. Of the cases observed to be successful, 55 were also predicted successful, while 7 observed-successful cases were predicted unsuccessful; that is an error. Conversely, 2 cases observed to be unsuccessful were predicted successful, again wrong, like a Type II error, and the remaining observed-unsuccessful cases were correctly predicted unsuccessful. So how much is correct overall? 90%: as the footnote says, 90 percent of the original grouped cases were correctly classified. The cross-validation is done only for the cases in the analysis.
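The score arithmetic from the coefficients above can be sketched as follows; the input values for the illustrative case are my own, not from the lecture.

```python
# Discriminant score from the unstandardised coefficients reported in the
# example: D = -17.413 + 0.056*IQ + 0.037*guidance + 0.015*income
b0, b_iq, b_guid, b_inc = -17.413, 0.056, 0.037, 0.015

def discriminant_score(iq, guidance, income):
    """Evaluate the estimated discriminant function for one case."""
    return b0 + b_iq * iq + b_guid * guidance + b_inc * income

# A hypothetical case: IQ 120, guidance score 150, income 400
d = discriminant_score(120, 150, 400)
```

The resulting score is then compared with the cutting point between the two group centroids (not shown in the output quoted here) to assign the case to one of the two groups.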
In cross-validation, each case is classified by the functions derived from all cases other than that case. In this example, 88.9% of the cross-validated grouped cases were correctly classified. You need not bother too much about the mechanics: when you run discriminant analysis in the software, it gives you these tables, and once you understand them you can even calculate the discriminant score and judge whether the model explains the data sufficiently or not. After that you can report it. Done this way, the technique helps you very clearly discriminate between two or more groups and draw inferences in the study accordingly, so it has very high utility. By now you will have understood that logistic regression and discriminant analysis are similar kinds of techniques; the only difference lies in the data's behaviour, whether it is normal or not. If it is normal, use discriminant analysis; if it is not following the normality condition, then it is a logistic regression case. Well, this is all for the day. Thank you very much.
Info
Channel: IIT Roorkee July 2018
Views: 5,112
Rating: 4.8904109 out of 5
Keywords: Prof. J. K. Nayak, Department of Management Studies, Indian Institute of Technology Roorkee, discriminant analysis, discriminant analysis and manova, discriminant analysis using spss, reporting the discriminant analysis result
Id: c6cvPPEE2pY
Length: 36min 49sec (2209 seconds)
Published: Thu Mar 28 2019