Oneway ANOVA SPSS program and interpretation

Captions
The purpose of this video is to illustrate the application of one-way analysis of variance using SPSS. We're going to be using the teens survey data set, which is provided through SPSS as part of its package. The analysis that we're going to be doing to begin with is a one-way ANOVA looking at whether or not there are significant differences in vocabulary scores according to socioeconomic status level. The null hypothesis is that the populations for the three groups of socioeconomic status (low, middle, and high) share a common population vocabulary mean; that is, there is no difference in mean vocabulary score for the three populations of low, middle, and high socioeconomic status.
The first thing we have to do when conducting a one-way ANOVA is to make sure there are sufficient sample sizes, sufficient cell sizes, in each level of our independent variable. So we need to make sure that there are a fair number of cases in low, middle, and high socioeconomic status. To do that, we go over to Analyze, click on Descriptive Statistics, go over to Frequencies, and load in socioeconomic status. That's really all it takes. We just want to make sure that there are enough subjects in each cell that we don't have to do any collapsing in order to run the analysis. What we see in this case is that we have 300 subjects and a minimum of 58 subjects in each one of our cells. So while our groups are not balanced, that is, we don't have an equal number of subjects in each one of our cells, we do have at least 10 subjects in each cell, so we're confident that we can at least get a valid analysis of variance to run in this case.
So we're going to go back, look at our data set again, and see to what extent our one-way ANOVA reveals whether or not there's a significant difference. To run the one-way ANOVA, click on Analyze, go to General Linear Model, and click over to Univariate. You send scaled vocabulary score to the dependent variable, and you send socioeconomic status to the fixed factors. These are essentially the independent variables. You can have multiple variables, but since we are running a one-way ANOVA, there's one independent variable, one fixed factor. In this case we don't have any random factors and we don't have any covariates; these issues will be discussed in later videos.
When we have a one-way ANOVA there is no need for us to specify any model, and there's no need to specify plots. We're not doing planned contrasts in this case, but if we have a significant difference we are interested in determining where those significant differences occurred. So we go to Post Hoc and check, in this case, S-N-K, which stands for Student-Newman-Keuls. We run that for the factor, which in this case is SES, our only independent variable, so you just send the factor from here over to the post hoc box and click Continue. Under Options, we want marginal means displayed for everything, which in this case is only our SES variable. We want to see descriptive statistics, estimates of effect size, and what the power is, and we want a homogeneity test done to make sure that we have not violated the homogeneity of variance assumption. Click Continue and then click OK.
The analysis comes up very promptly. It shows us that our overall mean was just over 50; the mean for low SES was 45.6, for medium SES it was 51.65, and for high SES it was 58.27. We can look at these means and see that they are different. The question is whether those differences are significantly greater than we would expect due to random sampling fluctuation alone.
The first thing that we look at is whether or not we have met the homogeneity of variance assumption. This assumption is what allows us to pool the variances across these three levels of socioeconomic status. The null hypothesis in this case is that the variances are equal. Since our significance value is greater than .05, we do not reject the assumption of homogeneity of variance.
So we go down and look at our Tests of Between-Subjects Effects table, where the main effect for socioeconomic status is listed. Going across that row, the mean square, which is the variance estimate between groups of socioeconomic status, is 3089.949. That is the between-groups variance. The within-groups (error) variance is 78.018. To calculate the F value for this test, we simply divide 3089.949 by 78.018, and that gives us the F ratio of 39.6. That tells us that the variance observed between groups is almost forty times greater than the amount of error variance, or within-groups variance. The significance value tells us that an F value of this size would have occurred due to random sampling fluctuation alone, in a situation in which the null hypothesis was actually true, less than one time in a thousand. Whether we set our alpha at .05, .01, or even .001, this significance value is less than any of those, so we reject the null hypothesis that the means for the groups within the population are equal and conclude that they are to some extent different.
The partial eta squared gives us an idea of how much variance in the dependent variable, vocabulary score, is accounted for by socioeconomic status. It tells us that about 21 percent of the variance in scaled vocabulary score is explained by socioeconomic status. The adjusted R-squared listed here takes into account that we are estimating a population value from a sample: it adjusts that R-squared down from .211 to .205, because our sample was only 300 instead of a thousand. So it's still a fairly substantial amount of variance that is being explained by this socioeconomic status variable.
The last thing that we're going to look at is where the significant differences occurred. The only thing that the one-way ANOVA tells us is that there is a significant difference among these means. To understand where that difference lies, we look at our post hoc tests. Each one of the subsets in this case represents a significant difference: there is no overlap between this group, this group, and this group. This indicates that all three of these means are significantly different from one another. So our conclusion is that there is a significant one-way ANOVA effect, that is, we reject the null hypothesis, and the reason is that there is a significant difference between each one of the three levels: low is significantly different from medium and high, medium is significantly different from low and high, and high is significantly different from low and medium.
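The arithmetic described in the captions can be checked with a short Python sketch. It uses only the two mean squares quoted in the video. The degrees of freedom (2 and 297) are an assumption derived from the 3 SES groups and the 300 subjects mentioned, and the variable names are illustrative, not SPSS output labels.

```python
# Mean squares as read out in the video's ANOVA table.
ms_between = 3089.949  # variance estimate between SES groups
ms_within = 78.018     # error (within-groups) variance estimate

# Assumption: all 300 subjects have valid scores and there are 3 SES groups.
n_total = 300
n_groups = 3
df_between = n_groups - 1        # 2
df_within = n_total - n_groups   # 297

# F ratio: between-groups variance divided by within-groups variance.
f_ratio = ms_between / ms_within

# Recover the sums of squares from the mean squares and their df.
ss_between = ms_between * df_between
ss_within = ms_within * df_within

# With a single factor, partial eta squared equals eta squared (R squared).
eta_squared = ss_between / (ss_between + ss_within)

# Adjusted R squared shrinks the sample estimate toward a population estimate.
adjusted_r2 = 1 - (1 - eta_squared) * (n_total - 1) / df_within

print(f"F({df_between}, {df_within}) = {f_ratio:.2f}")
print(f"eta squared = {eta_squared:.3f}")
print(f"adjusted R squared = {adjusted_r2:.3f}")
```

Under these assumptions the F ratio comes out close to 39.6 and eta squared close to .211, matching the values read out in the video, and the adjusted R-squared comes out near .205.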
Info
Channel: Scott Snyder
Views: 182,132
Rating: 4.4707379 out of 5
Id: _dcgrDpfFAI
Length: 9min 46sec (586 seconds)
Published: Wed Sep 11 2013