SPSS - General Linear Model (with interaction)

Captions
I'm in SPSS, and I'm going to have a look at the cholesterol data with a more advanced model. If you're watching this video, I hope you're comfortable with the simple model where we just included two variables together, because this one is going to get a little bit more complicated.

We're going to go back into Analyze, General Linear Model, Univariate. (I clicked Multivariate; it should be Analyze, General Linear Model, Univariate.) So I've got my simple model that I started with, and now I'm going to add in my other variables. I also had gender as a factor, so I'm going to add that in, and I had age as a covariate, so I'm going to add that in too. Now I need to go into Model, and I'm going to just select everything and pop it in as main effects.

Now, I've got two categorical variables, and I might be interested in whether there's any interaction between them. For abdominal fat and gender, I might want to know: does having greater abdominal fat have more of an impact on one gender than on the other? If the level of abdominal fat has the same impact on both genders, then we will see it here with the main effect. If it has a different effect for the two genders, say it affects men a lot more than it affects women, then we need another term to account for that. So we select abdominal fat and gender, choose Interaction, and add that in.

Just a word of warning: every time you add another term to your model, you need more data to be able to calculate the p-values. If you have a lot of categorical variables, it takes a lot of data to analyze them, and you may not be able to include all the interaction terms, depending on how much data you've collected. If that happens, start with a main effects model where you don't look at any interaction terms. Then you might have to add in the interaction terms one at a time, see if they are significant, and take them out again if they're not, because you may not have the power in the model to run it with all of the interaction terms to start with. In this case I'm starting with all of the terms in the model, and I'm going to remove them if they're not significant.

Contrasts: I'm going to take these out because I don't want all the output, so they're both at None.

Plots: we didn't do this in the simple one, and that's because we only had one factor. Now that we've got two factors, I want to look at this possible interaction between them. It can take a little bit of fiddling around to work out the best way to draw these plots, and sometimes it's easier just to draw them both ways and then see which one gives you the best visual description of what's going on. So I might try putting AbFat on the horizontal axis and Gender for the separate lines, and add that in as a plot. Then I might try doing it the other way, with Gender on the horizontal axis and AbFat for the separate lines, and add that in as another plot.

Now I will save my unstandardized predicted values and my standardized residuals. For the means, let's have a look at these. This will be easier to see on the plot, but let's get out the numbers in case we want them. In one of the papers that we looked at, I think they had estimates of effect size, so if you want to go into some more advanced analysis you could have a look at that, but this is okay for now and we don't need that.

Okay, so what we have first is a description of our categorical variables. You need to check this box just in case you've done something wrong. If you accidentally include a continuous variable as a factor, say you included weight as a factor, you would see it listed up here, and it would have a different level for every single observation, unless you had two observations which were exactly the same. So if this box looks overwhelmingly big, it probably means you've included a variable the wrong way.

Then in the next box, this is where we have all the interesting things. We have a p-value for each hypothesis about each variable. The null hypothesis for AbFat is that there is no difference in cholesterol between the levels of abdominal fat; that is significant, so we have evidence against it. The null hypothesis for weight is that there is no relationship between weight and cholesterol, and we've got strong evidence against that. The null hypothesis for gender is that there is no difference in cholesterol between the genders, and again there is strong evidence against that. All of these hypotheses are given that everything else is included in the model.

Age: we have a very high p-value here, so we have absolutely no evidence that age is an important predictor of cholesterol when we have already accounted for all these other variables. If we ran a regression of cholesterol versus age just by itself, it may turn out to be significant, and that's because we haven't accounted for the other variables. But with the weight in there, it's not important.

Then we have an interaction term, which is telling us that there is a different effect of abdominal fat for the different genders. The null hypothesis is that the effect of abdominal fat is the same over the genders. So the null hypothesis for each of these is that there is either no relationship or no difference, given that the other variables are in the model.

We have an adjusted R squared of about 95% (0.948), which is really, really high. It means we are explaining about 95% of the variation in our data with this model.

Here we have our estimated marginal means for abdominal fat (just ignore the pairwise comparisons, I think), and then again for gender. We've looked at the abdominal fat before: as the abdominal fat goes up, the mean for cholesterol also goes up. With gender we've got a smaller difference, but on average the males have a higher level than the females.

Now this is where you need to choose; this is where we did the plot two ways. One has AbFat on the horizontal axis with the separate lines for gender. This is probably the best way to do it; it makes more sense having one line connecting up all the female averages rather than doing it the other way. Sometimes it's interesting to look at them both, but let's have a look at it this way. We can see that for females, which is in blue, as the abdominal fat goes up we are getting an increase in cholesterol, but it's not as big as the increase the males are getting. The males have actually started off a little bit lower, but with some abdominal fat they've shot right up, and then they've gone up even higher with severe abdominal fat. So this is saying that abdominal fat is having more of an effect on the males than it is on the females. This is the interaction, and it is significant, as tested by this p-value here: AbFat * Gender. The "times" does not actually mean times; it means AbFat interacted with Gender. The null hypothesis is that there's no interaction, and this is telling us that we've got strong evidence against that, so there is an interaction there.

So we've got this model with just about everything significant. We could run it again taking out age, because that is not important; however, I'll let you do that by yourself. What I do want to look at is these residuals. I'm going to label the saved variables "full model predicted" and "full model" (I'll leave it at ZRE, as it's shorter) standardized residuals.

Now we can do our charts. We'll do a histogram first and check to see if it looks normal for our full model residuals. Put in the normal curve. That's looking a bit more normal now.

Next we're going to have a look at the residuals across the range of the predicted values, so we want the predicted values on the x-axis and the residuals on the y-axis. We can put a line in at zero, because the residuals should be around zero, which they will be; this will just help you if you're looking for any... it actually looks like there's a bit of a curve there, doesn't it? Possibly.

Something else you can do when you've got more than one covariate in the model, that is, more than one continuous variable in the model: sometimes you need to look at these residuals against each of those covariates to see if there's any curvature. So we can do this graph again, and instead of looking at the predicted values we could do it against weight. That looks fairly scattered about. We would also be interested in increasing variance. I don't know if we've got enough data at the low end to say that the variance is increasing in this case; it's possible. It certainly doesn't look like there's a curve there, which is what I was looking at that for. Take that out. Now, I'm not really too interested in age, because that wasn't a significant predictor anyway and I'm going to take it out of the model, but just for the sake of argument let's do that. And that actually looks quite random as well.

The other plots that you can do would be a P-P or a Q-Q plot for looking at normality, which we looked at in regression, and we can also look at the residuals in order of the observations, which we did in the regression too. I'll let you do that.

You can also go through and run the whole model again taking out age, because that wasn't significant (I'm just scrolling up to our model here). Every time you take something out of the model or put it back into the model, the p-values can change, so you need to do these one at a time. If you want to take something out of your model, just take out the one with the highest p-value, or the one you think is the least important, first. Then check what the p-values are doing and see if anything has changed. After that, if you need to, you can take out more, or you can look at adding in interaction terms if you didn't have enough data to add them all in to start with.
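
The interaction described in the captions (abdominal fat having a bigger effect on cholesterol for males than for females) can be pictured outside SPSS as a difference between differences in cell means. The following is a minimal sketch in Python, assuming made-up cholesterol values: the numbers, the level names ("none", "some", "severe") and the helper names are all hypothetical and are not taken from the video's dataset. It does not compute p-values or adjust for weight and age as the SPSS model does; it only shows what the two lines on the profile plot represent.

```python
from statistics import mean

# Hypothetical cholesterol readings (made-up numbers, NOT the video's data),
# keyed by (gender, abdominal fat level). Level names are assumptions.
data = {
    ("female", "none"):   [4.8, 5.0, 5.2],
    ("female", "some"):   [5.3, 5.5, 5.7],
    ("female", "severe"): [5.8, 6.0, 6.2],
    ("male", "none"):     [4.5, 4.7, 4.9],
    ("male", "some"):     [5.8, 6.0, 6.2],
    ("male", "severe"):   [6.8, 7.0, 7.2],
}

FAT_LEVELS = ["none", "some", "severe"]
GENDERS = ["female", "male"]


def cell_means(observations):
    """Mean cholesterol for every (gender, fat level) cell."""
    return {cell: mean(values) for cell, values in observations.items()}


def fat_effect(means, gender):
    """Change in mean cholesterol from the lowest to the highest fat level."""
    return means[(gender, FAT_LEVELS[-1])] - means[(gender, FAT_LEVELS[0])]


means = cell_means(data)
effects = {g: fat_effect(means, g) for g in GENDERS}

# The interaction contrast: how much larger the fat effect is for males.
interaction = effects["male"] - effects["female"]

# One printed row per gender corresponds to one line on the profile plot.
for g in GENDERS:
    row = ", ".join(f"{lvl}: {means[(g, lvl)]:.2f}" for lvl in FAT_LEVELS)
    print(f"{g:6s} {row}  (none -> severe: {effects[g]:+.2f})")
print(f"difference of differences: {interaction:+.2f}")
```

If the two lines were parallel, the difference of differences would be close to zero, which corresponds to the null hypothesis of no interaction. A clearly non-zero value, as in this invented data, is the pattern the interaction term is testing for.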
Info
Channel: PUB708 Team
Views: 54,784
Rating: 4.53125 out of 5
Id: YjSqRwrN_Ks
Length: 11min 51sec (711 seconds)
Published: Tue Oct 14 2014