Mod-01 Lec-16 Multivariate Analysis of Variance (MANOVA)

Captions
Good morning. Today we will discuss multivariate analysis of variance, popularly known as MANOVA. Today's contents are: the conceptual model for MANOVA, the assumptions of MANOVA, estimation of the model parameters, model adequacy tests, interpretation of results, and references. We will see how much we can complete today; the remaining portion we will complete in the next class.

Look at this slide. Last class I showed you that when $L = 2$, where $L$ stands for the number of populations, and $p = 1$, where $p$ is the number of variables, we use the t-test. The difference between the two population means is tested through $H_0: \mu_1 = \mu_2$ against $H_1: \mu_1 \neq \mu_2$. For $p \geq 2$, the multivariate case, we again have $H_0: \mu_1 = \mu_2$ against $H_1: \mu_1 \neq \mu_2$, but now $\mu_1$ and $\mu_2$ are vectors. When $p = 2$, each is a $2 \times 1$ vector, and we use Hotelling's $T^2$. Last class I also told you that when $L \geq 3$, that is, three or more populations, with one variable, you use ANOVA. We discussed the one-way, two-way, three-way and multi-way ANOVA concepts. Now, if the number of populations is three or more and the number of variables is two or more, what happens? You need MANOVA, not ANOVA.

So when do we go for ANOVA? You have $L$ populations, indexed $l = 1, 2, \ldots, L$, and from each population you collect observations $i = 1, 2, \ldots, n$. You are interested in the difference in the population means, where the populations are $l = 1$ to $L$. From each population you compute a sample mean: $\bar{x}_1$, $\bar{x}_2$, and so on.
Then finally there is the grand mean $\bar{x}$, which we saw last class. In ANOVA, $L \geq 3$ and $p = 1$: there are three or more populations with one variable. When $L \geq 3$ but $p$ is two or more, you go for MANOVA. The hypotheses we will see later; before that, let us see one example.

Last class, in ANOVA, we saw process A, process B and process C producing steel washers with one quality characteristic, the outer diameter. So there was one quality variable. Here we add one more quality variable, the inner diameter. The steel washers produced by the three processes are now measured on two quality characteristics, outer diameter and inner diameter. So $p = 2$, and there are three processes, so $L = 3$. This is a case for MANOVA.

I told you last class that one useful way of looking at data is the box plot. Look at the box plots of the two variables, outer and inner diameter, for the three processes A, B and C. You saw last class that for the outer diameter the mean differences are quite visible in the box plot. The lower figure shows the box plots of the inner diameter: this is the mean inner diameter for process A, this for process B, and this for process C. Apparently, the means of processes A and B are not different, but process C is different from both A and B.
In the outer diameter case, the difference between the means of processes A and C may not be significant, but process B differs from both A and C. So there are two different pictures. From the outer diameter point of view, as we also saw in ANOVA, process B differs from processes A and C. But once we add the inner diameter, process C perhaps differs from A and B in its mean inner diameter. Now we want to see the difference collectively, in the mean vector, where the vector consists of the mean outer diameter and the mean inner diameter. Collectively, are the three processes different or not? That is what we test through MANOVA.

We need some notation to fully understand ANOVA as well as MANOVA. In MANOVA one more dimension is added, because the number of variables is $p \geq 2$. In ANOVA, one axis is the populations and the other axis is the observations; for each population you have obtained certain observations. A general observation is $x_{il}$, the $i$th observation on the $l$th population, and it is a scalar quantity. What happens in MANOVA? One axis is again the populations and another is the observations, and we add one more dimension, the variables. We index populations by $l$, observations by $i$ and variables by $j$. That gives the general data structure.
Suppose this cell is $X_{il}$. In ANOVA it is a scalar quantity. In MANOVA it is no longer a scalar, because for this cell, as $j$ goes 1, 2, 3 and so on, $X_{il}$ takes one value per variable. So $X_{il}$ is a vector: $X_{il} = (X_{il1}, X_{il2}, \ldots, X_{ilp})^T$, which is $p \times 1$ because there are $p$ variables. That is the complexity: one more dimension is added, and you are now working in three dimensions, not two.

As you have $p$ variables, you also have $p$ mean values. The mean vector for population $l$ is $\mu_l = (\mu_{l1}, \mu_{l2}, \ldots, \mu_{lp})^T$, which is $p \times 1$. And as there are $p$ variables, there is also a covariance matrix. I write $\Sigma_l$ for the population; we write $S_l$ when we take a sample. $\Sigma_l$ is a $p \times p$ matrix. As $l$ varies from 1 to $L$, there are $\Sigma_1, \Sigma_2, \ldots, \Sigma_L$, and similarly $\mu_1, \mu_2, \ldots, \mu_L$. For mathematical simplicity we write $X_{il}$, $\mu_l$ and $\Sigma_l$ without the variable index; the variable part is implicit. This is shown pictorially in the figure. In ANOVA we have populations and observations. In MANOVA we have populations, observations and variables, and the general observation is the vector $X_{il}$. Keep this vector part in mind.
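The three-way data structure described above (populations, observations, variables) can be sketched in a few lines of Python. This is only an illustration: the measurements are made up rather than taken from the lecture's washer data, and the helper names `mean_vector` and `covariance_matrix` are hypothetical.

```python
# Hypothetical measurements (made up for illustration, not the lecture's data).
# Each population l holds n_l observations; each observation X_il is a
# p-vector, here (outer diameter, inner diameter), so p = 2.
data = {
    "A": [(10.0, 5.0), (12.0, 6.0), (11.0, 7.0)],
    "B": [(14.0, 5.0), (15.0, 6.0), (16.0, 4.0)],
    "C": [(11.0, 8.0), (10.0, 9.0), (12.0, 10.0)],
}

def mean_vector(obs):
    """Sample mean vector (length p) of a list of p-vectors."""
    n, p = len(obs), len(obs[0])
    return [sum(x[j] for x in obs) / n for j in range(p)]

def covariance_matrix(obs):
    """Sample covariance matrix S_l (p x p), with divisor n - 1."""
    n, p = len(obs), len(obs[0])
    m = mean_vector(obs)
    return [[sum((x[j] - m[j]) * (x[k] - m[k]) for x in obs) / (n - 1)
             for k in range(p)] for j in range(p)]

# One mean vector and one covariance matrix per population.
means = {name: mean_vector(obs) for name, obs in data.items()}
covs = {name: covariance_matrix(obs) for name, obs in data.items()}
```

Each population therefore contributes one $p \times 1$ mean vector and one $p \times p$ covariance matrix, which is the structure the rest of the lecture relies on.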
MANOVA makes certain assumptions and does hypothesis testing, like ANOVA. What are the hypotheses in ANOVA? $H_0: \mu_1 = \mu_2 = \cdots = \mu_L$, and $H_1: \mu_l \neq \mu_m$ for at least one pair $(l, m)$, where $l = 1, 2, \ldots, L$, $m = 1, 2, \ldots, L$ and $l \neq m$. In MANOVA the hypotheses have the same form: $H_0: \mu_1 = \mu_2 = \cdots = \mu_L$. But keep in mind that here we are saying $(\mu_{11}, \mu_{12}, \ldots, \mu_{1p})^T = (\mu_{21}, \mu_{22}, \ldots, \mu_{2p})^T = \cdots = (\mu_{L1}, \mu_{L2}, \ldots, \mu_{Lp})^T$. That is the difference: in ANOVA the means are scalars, in MANOVA they are vectors. The alternative hypothesis $\mu_l \neq \mu_m$ means $(\mu_{l1}, \mu_{l2}, \ldots, \mu_{lp})^T \neq (\mu_{m1}, \mu_{m2}, \ldots, \mu_{mp})^T$ for at least one pair $(l, m)$.

Like ANOVA, we partition the general observation, which here is $X_{il}$. Let us see the ANOVA case and the MANOVA case in parallel. In ANOVA, $x_{il}$ is partitioned as $x_{il} = \mu + (\mu_l - \mu) + (x_{il} - \mu_l)$, which we write as $x_{il} = \mu + \tau_l + \epsilon_{il}$, where $\mu$ is the grand mean and $\tau_l$ is the population effect. No model is perfect; it cannot predict the exact value of every observation, so there are random errors, and $\epsilon_{il}$ is the random error part. The same partitioning is possible in MANOVA: $X_{il} = \mu + (\mu_l - \mu) + (X_{il} - \mu_l)$. Here $X_{il} = (X_{il1}, X_{il2}, \ldots, X_{ilp})^T$ has $p$ components, and the grand mean vector is $\mu = (\mu_1, \mu_2, \ldots, \mu_p)^T$, where the subscript now indexes the variable, not the population.
Written out component by component, the partition is

$$\begin{pmatrix} X_{il1} \\ X_{il2} \\ \vdots \\ X_{ilp} \end{pmatrix} = \begin{pmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_p \end{pmatrix} + \begin{pmatrix} \mu_{l1} - \mu_1 \\ \mu_{l2} - \mu_2 \\ \vdots \\ \mu_{lp} - \mu_p \end{pmatrix} + \begin{pmatrix} X_{il1} - \mu_{l1} \\ X_{il2} - \mu_{l2} \\ \vdots \\ X_{ilp} - \mu_{lp} \end{pmatrix}.$$

Please note that the last entry of the grand mean vector is $\mu_p$, not $\mu_l$, and the last entry of the second vector is $\mu_{lp} - \mu_p$. This partitions the general observation into three components. The left-hand side is the general observation vector. On the right-hand side, the first vector is the grand mean vector, the second is the population effect vector, and the last is the random error vector. So we can write it in the same manner as before: $X_{il} = \mu + \tau_l + \epsilon_{il}$, where $X_{il}$ is the general observation, $\mu$ is the grand mean vector, $\tau_l$ is the population effect vector and $\epsilon_{il}$ is the random error vector.

Let us write down $\tau_l$. It is a vector, $\tau_l = (\tau_{l1}, \tau_{l2}, \ldots, \tau_{lp})^T$, which is $p \times 1$. So the $l$th population effect relates to all $p$ variables considered. We framed our hypotheses as $H_0: \mu_1 = \mu_2 = \cdots = \mu_L$ and $H_1: \mu_l \neq \mu_m$ for at least one pair $(l, m)$. Using $\tau_l$ we can state the null hypothesis in another way. What is $\tau_l$? It is $\mu_l - \mu$. If $H_0$ is true, all the means are equal; then what is the grand mean?
The grand mean is $\mu = (n_1 \mu_1 + n_2 \mu_2 + \cdots + n_L \mu_L) / (n_1 + n_2 + \cdots + n_L)$. If all the means are equal to a common vector, the numerator is $(n_1 + n_2 + \cdots + n_L)$ times that common vector, and the sum of the sample sizes cancels. So every $\mu_l$ equals the grand mean $\mu$. That is, $\mu_l = \mu$ if $H_0$ is true, which gives $\tau_l = \mu_l - \mu = 0$. So we can state the null hypothesis as $H_0: \tau_l = 0$ for $l = 1, 2, \ldots, L$, and the alternative as $H_1: \tau_l \neq 0$ for at least one $l$. Whether you test the hypothesis in terms of $\mu_l$ or in terms of $\tau_l$, you are doing the same thing.

So in MANOVA we partition in the same manner as in ANOVA. We have partitioned the observation; next we will see how to partition the variability. But what parameters are you estimating in MANOVA? You have to estimate $\mu$, the $\tau_l$ (equivalently the $\mu_l$), and the error terms. Another point: with unequal sample sizes, the weighted sum of the population effects is zero; with equal sample sizes, the plain sum of the population effects is zero. Let us prove the first one. We are saying that $\sum_{l=1}^{L} n_l \tau_l = 0$. Since $\tau_l = \mu_l - \mu$, the left-hand side is $\sum_{l=1}^{L} n_l (\mu_l - \mu) = \sum_{l=1}^{L} n_l \mu_l - \left( \sum_{l=1}^{L} n_l \right) \mu$.
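The identity $\sum_l n_l \tau_l = 0$ can also be checked numerically. The sketch below uses made-up population mean vectors and unequal sample sizes (both are assumptions for illustration, not lecture data) and exact fractions so that the result is exactly zero rather than zero up to rounding.

```python
from fractions import Fraction

# Hypothetical sample sizes and population mean vectors (p = 2), made up
# for illustration only.
n = [4, 6, 10]
mu = [
    [Fraction(10), Fraction(5)],
    [Fraction(13), Fraction(5)],
    [Fraction(11), Fraction(8)],
]
p = 2
N = sum(n)

# Grand mean: the mean of the population means, weighted by sample size.
grand = [sum(n_l * m[j] for n_l, m in zip(n, mu)) / N for j in range(p)]

# Population effect vectors tau_l = mu_l - mu.
tau = [[m[j] - grand[j] for j in range(p)] for m in mu]

# Weighted sum of the effects, component by component.
weighted_sum = [sum(n_l * t[j] for n_l, t in zip(n, tau)) for j in range(p)]
```

With equal sample sizes the weights cancel, which is why in that case the unweighted sum of the effects is zero as well.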
You have already seen that $\mu = (n_1 \mu_1 + n_2 \mu_2 + \cdots + n_L \mu_L) / (n_1 + n_2 + \cdots + n_L)$, so $\sum_l n_l \mu_l = n_1 \mu_1 + n_2 \mu_2 + \cdots + n_L \mu_L = (n_1 + n_2 + \cdots + n_L) \mu$. The second term of the left-hand side is also $(n_1 + n_2 + \cdots + n_L) \mu$; note that the $n_l$ must be there in that term. So the left-hand side is $(n_1 + n_2 + \cdots + n_L) \mu - (n_1 + n_2 + \cdots + n_L) \mu = 0$. You have to understand that the grand mean is the weighted mean of the means, $n_1 \mu_1 + n_2 \mu_2 + \cdots + n_L \mu_L$ divided by the total frequency. Once you see that, the two terms cancel and the sum is zero.

Now we will see the assumptions. What are they? The population covariance matrices are equal, the errors are normally distributed, and the errors are i.i.d. First, the equality of the population covariances: how do we test it? We use Box's M test. The null hypothesis is $H_0: \Sigma_1 = \Sigma_2 = \cdots = \Sigma_L$, and the alternative is $H_1: \Sigma_l \neq \Sigma_m$ for at least one pair $(l, m)$. Then we create a statistic, $D = (1 - u) M$. So you need to know what $M$ is and what $u$ is. On the slide you see that

$$M = -2 \ln \prod_{l=1}^{L} \left( \frac{|S_l|}{|S_{pooled}|} \right)^{(n_l - 1)/2},$$

that is, a product over the populations of the ratio of the determinant of $S_l$ to the determinant of $S_{pooled}$, raised to the power $(n_l - 1)/2$. All of you know $S_{pooled}$. How do we get it?
It is $S_{pooled} = \left[ (n_1 - 1) S_1 + (n_2 - 1) S_2 + \cdots + (n_L - 1) S_L \right] / (n_1 + n_2 + \cdots + n_L - L)$. You have seen this earlier in the two-population univariate case, where the pooled variance is $\left[ (n_1 - 1) s_1^2 + (n_2 - 1) s_2^2 \right] / (n_1 + n_2 - 2)$. Now look at the formula again. If $\Sigma_1 = \Sigma_2 = \cdots = \Sigma_L$, then each $S_l$ should be close to $S_{pooled}$, so each ratio of determinants will be close to one. And why have we taken the log? The product is multiplicative, and the log makes it additive. With the factor of minus two, the quantity becomes

$$M = \sum_{l=1}^{L} (n_l - 1) \ln |S_{pooled}| - \sum_{l=1}^{L} (n_l - 1) \ln |S_l|.$$

That is our $M$ value. And what is $u$?

$$u = \left[ \sum_{l=1}^{L} \frac{1}{n_l - 1} - \frac{1}{\sum_{l=1}^{L} (n_l - 1)} \right] \frac{2p^2 + 3p - 1}{6 (p + 1)(L - 1)}.$$

This was developed by Box. Put $M$ and $u$ into $D$; then $D$ approximately follows a chi-square distribution with $\nu$ degrees of freedom, where $\nu = \frac{1}{2} p (p + 1)(L - 1)$. You have to remember this $\nu$; it is the degrees of freedom for $D$. If $D \geq \chi^2_{\alpha, \nu}$, you reject $H_0$: the population covariances are not equal.

We have calculated this for our data set. First you compute the covariance matrices: for process A it is $S_1$, for process B it is $S_2$, for process C it is $S_3$. Can you recall the covariance matrix formula? If $X$ is $n \times p$, then $\bar{X}$ is $p \times 1$, and you form $X - \mathbf{1} \bar{X}^T$, where $\mathbf{1}$ is the $n \times 1$ vector of ones,
so that $\mathbf{1} \bar{X}^T$ is $n \times p$. Then $(X - \mathbf{1} \bar{X}^T)^T (X - \mathbf{1} \bar{X}^T) = (n - 1) S$. We have used the same formula here. From the two columns for process A, the outer diameter column and the inner diameter column, you get $S_1$. Using the same formula you get $S_2$ and $S_3$. Then you need $S_{pooled}$. Here $n_1 = n_2 = n_3 = n = 10$, so $S_{pooled} = \left[ (10 - 1) S_1 + (10 - 1) S_2 + (10 - 1) S_3 \right] / (10 + 10 + 10 - 3) = \frac{9}{27} (S_1 + S_2 + S_3) = \frac{1}{3} (S_1 + S_2 + S_3)$. Take any one of the values, say 1.29. How does 1.29 come? The corresponding value in $S_1$ is 1.51, in $S_2$ it is 1.43 and in $S_3$ it is 0.93, and $(1.51 + 1.43 + 0.93) / 3 = 1.29$.

Then, using the formula for $M$, you get $M = 1.04$, and $u = 0.11$. Then $D = (1 - u) M = 0.93$. What are the degrees of freedom for $D$ in this case? They are $\nu = \frac{1}{2} p (p + 1)(L - 1) = \frac{1}{2} \times 2 \times 3 \times (3 - 1) = 6$. So the degrees of freedom for $D$ are 6. The chi-square value with 6 degrees of freedom at $\alpha = 0.05$ is 12.59; you get it from the chi-square table. Then you compare the computed $D$ with the tabulated chi-square value. $D$ is 0.93, which is much less than 12.59. So we fail to reject the null hypothesis: there is no evidence that the population covariances differ, and we treat the assumption of equal population covariances as satisfied.
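The Box's M computation just described can be sketched in plain Python. This is a minimal illustration under stated assumptions: the three covariance matrices below are hypothetical (the lecture's full $S_1$, $S_2$, $S_3$ are not given in the transcript), the function names `det` and `box_m_test` are made up for this sketch, and the critical value 12.59 is the tabulated one quoted in the lecture for $\alpha = 0.05$ and 6 degrees of freedom.

```python
import math

def det(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [list(map(float, row)) for row in matrix]
    size = len(a)
    value = 1.0
    for c in range(size):
        pivot = max(range(c, size), key=lambda r: abs(a[r][c]))
        if a[pivot][c] == 0.0:
            return 0.0
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            value = -value  # a row swap flips the sign
        value *= a[c][c]
        for r in range(c + 1, size):
            factor = a[r][c] / a[c][c]
            for k in range(c, size):
                a[r][k] -= factor * a[c][k]
    return value

def box_m_test(cov_matrices, sample_sizes):
    """Return S_pooled, M, u, D = (1 - u) M and nu for Box's M test."""
    L = len(cov_matrices)
    p = len(cov_matrices[0])
    dof = [n - 1 for n in sample_sizes]
    total_dof = sum(dof)  # n_1 + ... + n_L - L
    pooled = [[sum(d * S[j][k] for d, S in zip(dof, cov_matrices)) / total_dof
               for k in range(p)] for j in range(p)]
    # Linearized form: sum (n_l - 1) ln|S_pooled| - sum (n_l - 1) ln|S_l|
    M = total_dof * math.log(det(pooled)) - sum(
        d * math.log(det(S)) for d, S in zip(dof, cov_matrices))
    u = (sum(1.0 / d for d in dof) - 1.0 / total_dof) * (
        2 * p * p + 3 * p - 1) / (6.0 * (p + 1) * (L - 1))
    nu = p * (p + 1) * (L - 1) // 2
    return {"S_pooled": pooled, "M": M, "u": u, "D": (1.0 - u) * M, "nu": nu}

# Hypothetical covariance matrices for three processes, n = 10 each.
S1 = [[1.5, 0.3], [0.3, 1.0]]
S2 = [[1.4, 0.2], [0.2, 1.2]]
S3 = [[0.9, 0.1], [0.1, 0.8]]
result = box_m_test([S1, S2, S3], [10, 10, 10])

CHI2_CRIT = 12.59  # tabulated chi-square value, alpha = 0.05, 6 d.o.f.
reject_equal_covariances = result["D"] >= CHI2_CRIT
```

For $p = 2$, $L = 3$ and $n_l = 10$ the function gives $\nu = 6$ and $u \approx 0.11$, matching the values quoted in the lecture; $M$ and $D$ depend on the covariance matrices actually supplied. In practice the critical value would usually come from a statistics library rather than a hard-coded table entry.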
If this is satisfied, we go for MANOVA. The decomposition is simple. The slide looks very difficult, but it is just what we have seen in ANOVA. Any observation $X_{il}$ you collect is partitioned, from the population point of view, as $X_{il} = \mu + (\mu_l - \mu) + (X_{il} - \mu_l)$. Now, what is the estimate of $\mu$? It is $\bar{X}$. What is the estimate of $\mu_l$? You have seen it in ANOVA: it is $\bar{X}_l$; the estimate of the population mean is the sample mean. So the sample observation is partitioned as $X_{il} = \bar{X} + (\bar{X}_l - \bar{X}) + (X_{il} - \bar{X}_l)$. That is what you did in ANOVA, and the same thing is possible in MANOVA. In MANOVA the observation vector is the $\mu$ vector plus the $(\mu_l - \mu)$ vector plus the $(X_{il} - \mu_l)$ vector, and the estimates follow in the same way. So we write the vector $X_{il} = \bar{X} + (\bar{X}_l - \bar{X}) + (X_{il} - \bar{X}_l)$, where $\bar{X}$ is the overall sample mean vector. The left-hand side is $p \times 1$, and each of the three terms on the right is also $p \times 1$. This is the general partitioning of the sample observation.

Now do a little more manipulation. Write $X_{il} - \bar{X} = (\bar{X}_l - \bar{X}) + (X_{il} - \bar{X}_l)$. In ANOVA you would take the square. Here, because these are vectors, you multiply by the transpose: $(X_{il} - \bar{X})(X_{il} - \bar{X})^T = \left[ (\bar{X}_l - \bar{X}) + (X_{il} - \bar{X}_l) \right] \left[ (\bar{X}_l - \bar{X}) + (X_{il} - \bar{X}_l) \right]^T$.
Our $X_{il} - \bar{X}$ is a $p \times 1$ matrix, its transpose is a $1 \times p$ matrix, and the product is a $p \times p$ matrix, which is what we want. Now, how many dimensions have we considered? One is $i$, another is $l$ and the other is $j$: $l = 1, 2, \ldots, L$; $i = 1, 2, \ldots, n_l$, since we allow unequal sample sizes; and $j = 1, \ldots, p$. We sum over these dimensions, first over $i$. Summing from $i = 1$ to $n_l$ and multiplying out the right-hand side term by term gives

$$\sum_{i=1}^{n_l} (X_{il} - \bar{X})(X_{il} - \bar{X})^T = \sum_{i=1}^{n_l} (\bar{X}_l - \bar{X})(\bar{X}_l - \bar{X})^T + \sum_{i=1}^{n_l} (\bar{X}_l - \bar{X})(X_{il} - \bar{X}_l)^T + \sum_{i=1}^{n_l} (X_{il} - \bar{X}_l)(\bar{X}_l - \bar{X})^T + \sum_{i=1}^{n_l} (X_{il} - \bar{X}_l)(X_{il} - \bar{X}_l)^T.$$

Look at the second term, $(\bar{X}_l - \bar{X})(X_{il} - \bar{X}_l)^T$, and the third term, $(X_{il} - \bar{X}_l)(\bar{X}_l - \bar{X})^T$. In both, $\bar{X}_l - \bar{X}$ is independent of $i$, so the summation over $i$ acts only on $X_{il} - \bar{X}_l$. Now $\sum_{i=1}^{n_l} X_{il} = n_l \bar{X}_l$, and $\sum_{i=1}^{n_l} \bar{X}_l$ is also $n_l \bar{X}_l$, because $\bar{X}_l$ is independent of $i$. So $\sum_{i=1}^{n_l} (X_{il} - \bar{X}_l) = 0$, and the second term becomes zero. Similarly the third term becomes zero. So the two middle terms drop out, because they are zero.
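The vanishing of the two middle terms can be seen on a small example. The sketch below uses one hypothetical group of observations and a hypothetical grand mean vector (both made up, with values chosen so the arithmetic is exact), and accumulates the cross term $\sum_i (\bar{X}_l - \bar{X})(X_{il} - \bar{X}_l)^T$.

```python
def outer(u, v):
    """Outer product u v^T of two vectors, as a list of rows."""
    return [[a * b for b in v] for a in u]

def add(A, B):
    """Element-wise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

# Hypothetical observations for one population l, and a hypothetical
# grand mean vector (made up for illustration).
group = [(14.0, 5.0), (15.0, 6.0), (16.0, 4.0)]
grand_mean = [12.0, 7.0]

n_l, p = len(group), len(grand_mean)
group_mean = [sum(x[j] for x in group) / n_l for j in range(p)]
between_dev = [group_mean[j] - grand_mean[j] for j in range(p)]

# Cross term: sum over i of (Xbar_l - Xbar)(X_il - Xbar_l)^T.
cross = [[0.0] * p for _ in range(p)]
for x in group:
    within_dev = [x[j] - group_mean[j] for j in range(p)]
    cross = add(cross, outer(between_dev, within_dev))
```

Because the within-group deviations sum to the zero vector, `cross` comes out as the zero matrix whatever grand mean is used; the other cross term is its transpose and vanishes for the same reason.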
So the resulting equation is

$$\sum_{i=1}^{n_l} (X_{il} - \bar{X})(X_{il} - \bar{X})^T = \sum_{i=1}^{n_l} (\bar{X}_l - \bar{X})(\bar{X}_l - \bar{X})^T + \sum_{i=1}^{n_l} (X_{il} - \bar{X}_l)(X_{il} - \bar{X}_l)^T.$$

This can be written further. In the first term on the right there is no $i$, so you can straight away write $n_l (\bar{X}_l - \bar{X})(\bar{X}_l - \bar{X})^T$. The second term does contain $i$, so its summation stays. We have taken the sum over $i$; now we take the sum over $l$, from 1 to $L$, in every term:

$$\sum_{l=1}^{L} \sum_{i=1}^{n_l} (X_{il} - \bar{X})(X_{il} - \bar{X})^T = \sum_{l=1}^{L} n_l (\bar{X}_l - \bar{X})(\bar{X}_l - \bar{X})^T + \sum_{l=1}^{L} \sum_{i=1}^{n_l} (X_{il} - \bar{X}_l)(X_{il} - \bar{X}_l)^T.$$

Do we also need a summation over the $p$ variables? No, because we are doing everything in the matrix domain, and the vectors already take care of the variables. Now, what is the left-hand side? If $X_{il}$ and $\bar{X}$ were scalars, it would be a sum of squares over all the observations, which you have seen in ANOVA as SST, the total sum of squares. But here these are vectors, and multiplying a vector by its transpose in this manner creates a matrix, not a scalar, of dimension $p \times p$. So the left-hand side is $p \times p$, and each of the two terms on the right is also $p \times p$. The diagonal elements carry the variability of each variable (the sums of squares) and the off-diagonal elements carry the covariability (the sums of cross products). So the left-hand side is the total SSCP, the sum of squares and cross products matrix. The first term on the right, which compares the population mean vectors with the grand mean vector,
we call the between SSCP, and the last term is the error SSCP. These are not actually covariance matrices; a covariance matrix would be divided by the degrees of freedom. So the total sum of squares and cross products matrix is divided between two sources of variability, one the populations and the other the errors: $SSCP_T = SSCP_B + SSCP_E$. This is the big difference from ANOVA, where you get a scalar quantity everywhere. What are the degrees of freedom? $N - 1 = (L - 1) + (N - L)$, the same as in ANOVA, where $N = \sum_{l=1}^{L} n_l$ is all the observations together. So in ANOVA we partition SST into SSB and SSE; in MANOVA we partition the total sum of squares and cross products matrix into between-population and error parts.

Computing $SSCP_T$, $SSCP_B$ and $SSCP_E$ directly from the sums of outer products is laborious. From the computation point of view, $SSCP_B$ is a little easier than the other two, so first compute it from its formula: $SSCP_B = \sum_{l=1}^{L} n_l (\bar{X}_l - \bar{X})(\bar{X}_l - \bar{X})^T$. For $SSCP_E$ there is a formula: $SSCP_E = (n_1 - 1) S_1 + (n_2 - 1) S_2 + \cdots + (n_L - 1) S_L$. You have seen this in the pooled covariance, where it was divided by the degrees of freedom; here it is not a covariance matrix but an SSCP matrix, so it is not divided. $S_1$ to $S_L$ you can compute very easily. So $SSCP_E$ and $SSCP_B$ are both computed from formulas.
Then you compute $SSCP_T = SSCP_B + SSCP_E$. These are the steps: first compute $SSCP_B$, then $SSCP_E$, then $SSCP_T$. So this is our decomposition. I should not say decomposition of the covariance matrix; the covariance matrices follow from it once you divide by the degrees of freedom, but what we decompose is the sum of squares and cross products matrix. It is better to write SSCP matrix: the total SSCP matrix is split into these two quantities. I think we will stop here today. Next class I will show you the MANOVA table, then the tests and how to go about the hypothesis testing, then pairwise comparisons and other things. Thank you very much.
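The three computation steps can be sketched in plain Python. The data below are hypothetical (made up so that all means are whole numbers, not the lecture's washer measurements), and the helper names are assumptions for this sketch. The total SSCP is also computed directly from its definition so that the identity $SSCP_T = SSCP_B + SSCP_E$ can be checked.

```python
def mean_vector(obs):
    """Sample mean vector of a list of p-vectors."""
    n, p = len(obs), len(obs[0])
    return [sum(x[j] for x in obs) / n for j in range(p)]

def outer(u, v):
    return [[a * b for b in v] for a in u]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

def zeros(p):
    return [[0.0] * p for _ in range(p)]

def sscp_about(obs, center):
    """Sum over observations of (x - center)(x - center)^T."""
    total = zeros(len(center))
    for x in obs:
        d = [x[j] - center[j] for j in range(len(center))]
        total = add(total, outer(d, d))
    return total

# Hypothetical (outer diameter, inner diameter) data for three processes.
groups = {
    "A": [(10.0, 5.0), (12.0, 6.0), (11.0, 7.0)],
    "B": [(14.0, 5.0), (15.0, 6.0), (16.0, 4.0)],
    "C": [(9.0, 9.0), (10.0, 10.0), (11.0, 11.0)],
}
all_obs = [x for obs in groups.values() for x in obs]
grand = mean_vector(all_obs)
p = len(grand)

sscp_b, sscp_e = zeros(p), zeros(p)
for obs in groups.values():
    m = mean_vector(obs)
    d = [m[j] - grand[j] for j in range(p)]
    sscp_b = add(sscp_b, scale(len(obs), outer(d, d)))  # n_l (Xbar_l - Xbar)(...)^T
    sscp_e = add(sscp_e, sscp_about(obs, m))            # equals (n_l - 1) S_l

sscp_t = add(sscp_b, sscp_e)              # step 3 of the lecture
sscp_t_direct = sscp_about(all_obs, grand)  # definition, for comparison
```

Here the error SSCP is accumulated from within-group deviations, which is the same quantity as $\sum_l (n_l - 1) S_l$; with real, non-integer data the two routes to the total would agree only up to floating-point rounding.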
Info
Channel: nptelhrd
Views: 21,363
Keywords: Multivariate Analysis of Variance (MANOVA)
Id: lPqKhLF4s-A
Length: 59min 27sec (3567 seconds)
Published: Fri May 09 2014