Conducting a MANOVA in SPSS with Assumption Testing

Video Statistics and Information

Captions

Hello, this is Dr. Grande. Welcome to my video on conducting a MANOVA in SPSS. A MANOVA allows us to look for differences across several dependent variables.

In this example I have here, and these are fictitious data, we have an ID variable and we have an independent variable, and there are three levels to this independent variable. Let's say that these data are collected from a substance use facility, specifically one treating alcohol use disorder. They're going to have treatment as usual, which is one of the levels of this independent variable, program, but you want to add some sort of specialized individual treatment or group treatment. So you have individual, group, and treatment as usual. Notice we have two dependent variables. One is functioning; for this scale, let's say the lower value indicates higher functioning. Then we have quantity, and this would be the quantity of alcohol consumed, say, during a period of a month. Of course a higher quantity would not be what we'd want; that would indicate a less successful outcome.

What MANOVA does is create a linear combination of, in this case, these two dependent variables, although you can have more than two, and it tests to see if there are any differences across the levels of the independent variable on that linear combination of the two dependent variables. So it lets you detect differences across the levels of the independent variable on that linear combination of dependent variables, but until you look further you won't know where those differences are. It's a linear combination; it's not one variable or the other where you're going to see the difference, it's the combination of the two. There are some follow-up tests as part of the MANOVA procedure, and in those you can see: was the difference on the functioning dependent variable, or on quantity, or was it on both, or was there no difference? Did all three treatments work
equally well on the linear combination of the two dependent variables and on each dependent variable separately?

So before we get started running a MANOVA, I want to go over the assumptions. I'm going to show you how to test for the assumptions of MANOVA, and there are a few assumptions for this particular statistic. You do need two or more dependent variables, and they have to be measured at the interval or ratio level. SPSS refers to those levels collectively as Scale, so if you look here in the Variable View, there is no choice for interval or ratio; you just have Scale, which means it's either interval or ratio. Let's assume in this case that functioning is interval and quantity is ratio, but again, there's no way to distinguish between the two in SPSS; they're both treated the same.

You need to have at least one independent variable (you can have more than one), and that independent variable must have two or more categorical groups. You can see we meet that assumption with program: we have individual therapy, group therapy, and treatment as usual. The observations need to be independent, and you need to make sure you have a sufficient sample size; specifically, you need a minimum number of observations for each level of the independent variable.

Now, there are different opinions about what the minimum should be. One common minimum used is 20, so you need 20 sets of scores on these dependent variables for each level of the independent variable. As you can see, in my case I only have 15 in each group, because I only have 45 total. Another opinion is that the number of cases for any one level of the independent variable needs to be greater than the number of levels of the independent variable multiplied by the number of dependent variables. So if we have three levels of this independent variable and two dependent variables, that would be six. If we're looking at it that way, we do meet the assumption of sufficient sample size by level of the
independent variable.

MANOVAs assume multivariate normality, and they are sensitive to outliers. There needs to be a linear relationship between each pair of dependent variables across each level of the independent variable. This test also needs homogeneity of covariance matrices, but that assumption is tested as part of the MANOVA. And the dependent variables in a MANOVA cannot be multicollinear.

So let's first test for outliers. We're going to do that with a linear regression; in this case it's going to be Analyze, Regression, Linear. For this test the independent variable in the MANOVA is actually the dependent variable, because we're testing an assumption here; this isn't the MANOVA, this is a linear regression. So the independent variable will become the dependent for this test, and both quantity and functioning, which are the dependent variables, will become independent variables here. Everything will stay as default, except that under Save I'm going to create a new variable, and what I want is the Mahalanobis distance. So it's this box here, Mahalanobis, and this will create a new variable. I'm going to click Continue and then OK. You can see that it conducted a regression here, but what we're interested in is at the end, under Residuals Statistics: we want to have a look at the maximum value for the Mahalanobis distance, and that is 5.634.

So now we have this value here, 5.634, and what we need to consider is the maximum allowable critical value for the Mahalanobis distance. There are critical value tables available for this, but for a MANOVA with two dependent variables the maximum would be 13.82; anything above that would be an outlier. For three dependent variables it's 16.27. Here we have two dependent variables, so we know it's 13.82, and we can see that we do not have any values that exceed 13.82. Now,
say this maximum value did exceed 13.82. If you want to identify the outliers quickly, you would go to Data and Sort Cases, and you would sort on the Mahalanobis distance, descending, so you'd bring the larger values to the top. If I sort this, you can see that there is our maximum value, and at the bottom is our minimum value. Of course, to return that back to the way it was, you just go to Sort Cases; in this case I have an ID variable (I try to always have an ID variable), and you would move that into Sort by and leave it as ascending, and now your data set is back to the way it was. So now that this is tested we could clear this variable, but I'm just going to leave it.

Now we're going to build a scatter plot. I'm going to go to Graphs, Legacy Dialogs, all the way down to Scatter/Dot, and I want the Matrix Scatter here. I have three choices; one is Define, so I'm going to click Define. You can see I've already populated these variables; I'll show you what I did here. It's functioning and quantity to Matrix Variables, and program to the Rows list box. I will make a change under Options: I'm going to exclude cases variable by variable. There are no missing cases in these fictitious data, so you won't have to worry about it here, but if you are missing values you do want to exclude cases variable by variable for this analysis. So click OK. What we're looking for here is an elliptical pattern, generally moving from the bottom left to the top right, and as you look at these six different plots you can see that is the general pattern, so we're pretty good here. If you see kind of a square, then you're violating the assumption of that linear relationship, but here there does appear to be a linear relationship for all six of the plots.

Another assumption is multivariate normality, and the way that's often tested is by looking at the normality of each of the dependent variables. Now, that is a different thing than
multivariate normality, but that's how we typically test it. So we'll go to Analyze, Descriptive Statistics, Explore. You can see I have this populated, so I'm going to reset it. It's just going to be functioning and quantity over to the Dependent List, and then under Plots we're going to want to check off Normality plots with tests. I don't really use the stem-and-leaf too much, but I will check off Histogram. So let's run it this way and take a look at our results. What we're most interested in here is the significance level of Shapiro-Wilk. If it's greater than .05, we're going to assume normality, and if it's less than or equal to .05, we're going to assume that we have a non-normal distribution for that variable. So we can see that we are potentially violating an assumption here. Again, this is multivariate normality that we're supposed to be testing, and I'm testing univariate normality, but we can see that we have to reject the null hypothesis here on quantity, so we're going to assume we do not have a normally distributed variable there, and for functioning we do.

So let's now move to the assumption of no multicollinearity. The way we test this is through correlation, so we're going to use bivariate correlation. Again you can see I've populated this; I'll reset it. Just functioning and quantity get moved over to the Variables list box. Under Correlation Coefficients the default is Pearson; I'm going to leave it set there, and there are no other changes, so I'll click OK. You can see we have a correlation between the two dependent variables, functioning and quantity, of 0.339. The assumption I'm testing is that we don't have multicollinearity, and that's defined in a few different ways. One cutoff is 0.9, so if the Pearson's r value is greater than 0.9 then the variables are multicollinear; another cutoff is 0.8. But we also have to make sure that there is enough of a relationship between the two
variables. So it's not just the absence of multicollinearity that we're looking at here; we also want to make sure that there is a relationship that meets a certain level, and generally we say above 0.2. So here, 0.339 is acceptable, and we've met that assumption.

With our assumptions tested, let's move to the MANOVA. That will be Analyze, General Linear Model, Multivariate. I'm going to reset this. The dependent variables (remember, MANOVA handles multiple dependent variables) will be functioning and quantity, and the fixed factor, the independent variable, will be program. Note that there's also the ability to add a covariate; if you were to include a covariate, that would make this analysis a MANCOVA instead of a MANOVA.

Looking at our buttons here to the right: Model is going to stay the same. Contrasts: just to show you how this works, I'm going to add a simple contrast. You can choose the reference category; you have a choice of Last or First. In an analysis like this we're really comparing individual therapy and group therapy to treatment as usual, so it makes sense for the reference category to be the last level of the independent variable, which is treatment as usual. So I'm going to leave that as Last, and make sure you click Change, so you can see program (Simple). Plots I'm not going to change. Post Hoc: now, we don't know if we're going to get a significant value any time we enter into running a MANOVA, but in case we do, we're going to set this up for post hoc tests of program. Because all my group sizes are equal, I'm going to use Tukey; if I had unequal group sizes I would use Scheffé. So I'm just checking Tukey after moving program over to the Post Hoc Tests for list box, and clicking Continue. Under Save there'll be no changes, and under Options I'm going to move program over to Display Means for. I do want homogeneity tests, descriptive statistics, effect size, and observed power. Click Continue. So now this dialog is
configured for MANOVA. I'll click OK, and you can see here are the three levels of the independent variable: individual, group, and treatment as usual. As I mentioned, the sample sizes are the same for each group; they're equal at 15. Taking a look at the descriptive statistics, you can see that for functioning it does appear that the mean is a bit lower for individual, higher for group, and even higher for treatment as usual. For quantity of alcohol consumed during the month, we can see that individual again is the lowest; group is lower than treatment as usual but not quite as low as individual; and treatment as usual is the highest. And of course remember, this analysis is based on fictitious data.

So moving to Box's Test of Equality of Covariance Matrices, we have a significance here of 0.781, so we have met this assumption, because any significance value greater than 0.001 would meet the assumption. So no problems there with Box's M.

Then we'll move down to Multivariate Tests. As you can see, there are four different values that you can interpret, and we're only going to interpret one. A common way to select which of these to use is based on whether we have met the assumptions for MANOVA or not. If all the assumptions have been met, we would select Wilks' lambda; in this case I would say that we did not meet all the assumptions, because there's some uncertainty regarding normality. So if we have met all the assumptions we select Wilks' lambda; if we have not, it's Pillai's trace. In this case I'm going to go with Pillai's trace because of that Shapiro-Wilk result, and you can see here it's .026, so we have statistical significance even using Pillai's trace. What we're saying here is that there is a statistically significant difference across the levels of the independent variable on a linear combination of the dependent variables. We don't know where that difference is; we can't speak yet as to whether it's on functioning,
quantity, both, or neither, but we know that, using a linear combination of the dependent variables, we do have a statistically significant difference between the levels of the independent variable, which were individual and group counseling and then treatment as usual. So those groups are different on that linear combination.

Let's take a look a little further down and get some more detail. Now, we do want to interpret Levene's test of equality of variances, and in this case we have a non-significant result for functioning and for quantity, so we're good. We would worry if we were at .05 or below, but with these significance values we're in good shape; no cause for concern.

Now we have Tests of Between-Subjects Effects. We want to look at program, so don't worry about Corrected Model or Intercept; just start here at program. First we'll look at functioning. We see there is a statistically significant value here, 0.005, on this dependent variable, functioning, and there's not a statistically significant finding on the variable quantity.

Moving down a little more we get to the contrasts. Remember I ran the simple contrast with the reference category of treatment as usual, which you see indicated here by the value 3, the value assigned to treatment as usual. You can see that between level 1 and level 3 we have a statistically significant difference on functioning, not on quantity, and between level 2, which would be group, and treatment as usual, a non-significant result on both.

So we have the multivariate test results again here. I will point out, which I didn't note before, the partial eta squared. If you move up, you see here it's .122, and you see it is the same here. That indicates that 12.2% of the variance in the dependent variables can be explained by the independent variable, which in this case is the treatment. Moving down to the univariate test results, we can see that
just looking at functioning we do have a statistically significant finding, and we don't with quantity. You can see that for quantity very little of the variance is explained, 6.4 percent, whereas for functioning 22.4 percent of the variance is explained. Again we can see the means here.

We will interpret the post hoc tests because we had a statistically significant finding on Pillai's trace, and similarly that is why we were able to interpret these univariate test results. Looking here at the post hoc tests (this was a Tukey test), you can see individual to group: the difference between the individual level and the group level of treatment is non-significant, .059. The difference between the individual and treatment as usual levels of the independent variable was statistically significant, .004. So there's a statistically significant difference between the individual level and the treatment as usual level on the functioning dependent variable. Then looking at group and treatment as usual, we have a non-significant finding. Then looking at quantity, which is the other dependent variable, we have a non-significant finding for individual and group, non-significant for individual and treatment as usual, and non-significant between group and treatment as usual.

So, as an overall interpretation of the MANOVA that we ran: we have a statistically significant difference just between individual and treatment as usual, but no other statistically significant findings. So it does appear that individual therapy is more effective than treatment as usual, but only for one of the two dependent variables, only for functioning.

I hope you found this video on conducting a MANOVA in SPSS to be helpful. As always, if you have any questions or concerns, feel free to contact me; I'll be happy to assist you.
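
The outlier screen described in the captions (a Mahalanobis distance for each case, compared against a critical value of 13.82 for two dependent variables) can be sketched outside SPSS. This is a minimal illustration, not the SPSS procedure itself: the function names and the sample scores are hypothetical, the distance is the squared Mahalanobis distance from the centroid using the sample covariance matrix, and the cutoff is assumed to be the chi-square critical value at p = .001 with 2 degrees of freedom, which has a closed form and matches the 13.82 quoted in the video.

```python
import math

def chi2_critical_df2(alpha=0.001):
    # With 2 degrees of freedom the chi-square survival function is exp(-x/2),
    # so the critical value is -2 * ln(alpha). Assumes alpha = .001 is the
    # convention behind the 13.82 cutoff quoted in the captions.
    return -2.0 * math.log(alpha)

def mahalanobis_sq(points):
    # Squared Mahalanobis distance of each (x, y) pair from the centroid,
    # using the sample (n - 1) covariance matrix of the two variables.
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy ** 2  # determinant of the 2x2 covariance matrix
    result = []
    for x, y in points:
        dx, dy = x - mx, y - my
        result.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return result

def flag_outliers(points, alpha=0.001):
    # Indices of cases whose squared distance exceeds the critical value.
    cutoff = chi2_critical_df2(alpha)
    return [i for i, d in enumerate(mahalanobis_sq(points)) if d > cutoff]

# Hypothetical (functioning, quantity) scores, for illustration only.
scores = [(12.0, 30.0), (15.0, 42.0), (9.0, 25.0), (20.0, 55.0), (14.0, 38.0)]
print(round(chi2_critical_df2(), 2))
print(flag_outliers(scores))
```

Sorting the distances in descending order, as the video does with Sort Cases, is just `sorted(mahalanobis_sq(scores), reverse=True)` in this sketch.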

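Two of the rule-of-thumb checks from the captions are simple arithmetic and can be written down directly: the sample-size rule (cases per group greater than levels times dependent variables) and the correlation window for the dependent variables (above 0.2, not above 0.9). The sketch below is illustrative only; the function names are hypothetical, and applying the cutoffs to the absolute value of r is an assumption, since the video only discusses a positive correlation.

```python
import math

def min_cases_rule(n_levels, n_dvs):
    # Threshold from the captions: levels of the independent variable
    # multiplied by the number of dependent variables.
    return n_levels * n_dvs

def meets_min_cases(cases_per_group, n_levels, n_dvs):
    # Each group needs MORE cases than the threshold.
    return cases_per_group > min_cases_rule(n_levels, n_dvs)

def pearson_r(xs, ys):
    # Pearson correlation computed from deviations around the means.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def correlation_verdict(r, low=0.2, high=0.9):
    # Cutoffs follow the captions (0.9 for multicollinearity, 0.2 as the
    # minimum relationship); using abs(r) is an assumption of this sketch.
    size = abs(r)
    if size > high:
        return "multicollinear"
    if size < low:
        return "too weakly related"
    return "acceptable"

# Values from the video: 3 program levels, 2 dependent variables,
# 15 cases per group, and r = 0.339 between functioning and quantity.
print(min_cases_rule(3, 2), meets_min_cases(15, 3, 2))
print(correlation_verdict(0.339))
```

Passing `high=0.8` applies the stricter multicollinearity cutoff that the video also mentions.
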
Info

Channel: Dr. Todd Grande

Views: 63,617

Rating: 4.9256506 out of 5

Keywords: SPSS, MANOVA, MANCOVA, Levene’s Test, Homogeneity of Variances, Homogeneity of variance test, Tukey, Post Hoc, Scheffe, mahalanobis, distance, independent variables, dependent variables, Pillai’s Trace, Wilk’s Lambda, Box’s M, covariance matrices, outliers, multicollinear, Multicollinearity, counseling, Grande, Multivariate Analysis Of Variance, Statistics (Field Of Study)

Id: rCgeWeXRtDs

Length: 25min 11sec (1511 seconds)

Published: Mon Aug 24 2015