Full Model testing (Multiple linear regression in SPSS)

Video Statistics and Information

Captions
This video will show you how to carry out multiple linear regression in SPSS. It uses a VO2 data set. The population from which the sample was taken was males and females aged between 20 and 40 and weighing between 135 and 220 pounds. The research question we want to answer is: to what extent can we predict absolute VO2 max from weight, age, gender, one-mile walk time, heart rate, body temperature and height in the given population?

To answer this question, we'll create a linear regression model from a sample of data taken from the whole population for whom we want to predict VO2 max. We do not know what the equation of the model will be for the population, but we can make estimates based on our sample and use these to make predictions for the whole population.

Before we can create a multiple linear regression model, we need to check that there is a linear relationship between each independent variable and VO2 max. We also want to check that the independent variables do not have multicollinearity. To do this, click Graphs, Legacy Dialogs, Scatter/Dot and choose a matrix scatter. Add all the variables to Matrix Variables. In Titles you should add an appropriate title. In Options we can define how to deal with missing data, but this does not apply to us. Click OK and we get the following output. Double-click to activate the Chart Editor. To make the chart bigger, go into Edit and Properties to resize it.

First we want to look at the relationship between each independent variable and VO2 max. Weight, one-mile walk time and gender seem to have the strongest relationships. We also note that height and weight have a very strong positive relationship, which suggests a problem of multicollinearity.

We also want to look at the correlation matrix. To do this, click Analyze, Correlate and Bivariate. Again, add all the variables to the variable list. We will use Pearson's correlation coefficient with a two-tailed test of significance. Again, in Options we can define how to deal with missing data and include any optional statistics. Clicking OK, we get the correlations table. The values in the table back up our comments on the scatterplot matrix. One-mile walk time has the highest correlation with VO2 max. Gender also has a high correlation. Weight and height have a medium-strength correlation, and age, heart rate and body temperature have a very low correlation with VO2 max. Height and weight are shown to have a very high correlation with each other, as we expected. Since some variables have low correlation and we have some multicollinearity, we may want to remove some variables from the model if they're not significant predictors.

To begin with, we will consider a linear regression model with all the independent variables included. To do this, click Analyze, Regression and Linear. Add VO2 max as the dependent variable and add all the other variables as the independents. We will use the Enter method, as we want SPSS to include all the variables. We do not have a selection variable, case labels or WLS weight. In Statistics we want to include the part and partial correlations. In Plots we want to include a histogram and a scatter with the standardized predicted values on the x-axis and the standardized residuals on the y-axis. We will use the graphs to test the assumptions of linear regression in video three. Click OK to get the linear regression output.

First we have the model summary. We are interested in the R-squared value and the standard error of the estimate. The R-squared value is the coefficient of determination and represents the proportion of variation in VO2 max that can be explained by changes in the independent variables. The adjusted R-squared value takes the number of predictors into account, whereas the R-squared value would increase if we added more predictors even if they did not improve the model. The standard error of the estimate tells us how much the observed values of VO2 max typically differ from the values predicted by the regression model. We have a standard error of the estimate of 0.0971292. This is small, which suggests that the linear regression model is a good fit. However, the range of VO2 max values in the data set is only 3.2356, so we would not expect a large value for the standard error, as even a badly fitting model would lie close to the observed data points due to the small data range.

Next we have the ANOVA table. This tests the null hypothesis that all the coefficients are zero and we can predict VO2 max equally well from its mean. The alternative hypothesis is that at least one of the coefficients is different from zero. We're interested in the p-value, which is given in the last column. Since this is less than 0.05, we can reject the null hypothesis at the 5 percent significance level.

Now we have the coefficients table. This gives the estimates for the coefficients of the regression model based on our sample. The multiple linear regression equation is:

VO2 max = 10.006 + 0.008 × weight - 0.027 × age + 0.933 × gender - 0.217 × time - 0.014 × heart rate - 0.049 × temperature - 0.001 × height

We also have a significance value for each predictor. Here a t-test is used to test the null hypothesis that the true population value of the coefficient is equal to zero. The alternative hypothesis is that it is different from zero. For body temperature and height the p-values are greater than 0.05, so we cannot reject the null hypothesis. This means we should consider removing these predictors from the model.

In summary, first we looked at how to check that there is a linear relationship between the dependent and independent variables. We also looked at the relationships between the independent variables to check for multicollinearity and found that this may be a problem with height and weight. Next we looked at creating a linear regression model and discussed the appropriate options to get the output that will help us make decisions about the model and check the assumptions of linear regression. Then we looked at the SPSS output. First we had the model summary, giving the R-squared value, which gives the percentage of variability in VO2 max explained by changes in the independent variables. We also had the standard error of the estimate, which tells us how much the observed values of VO2 max differ from the values predicted by the regression model. We also had the ANOVA table, which tests the null hypothesis that a model using only the mean predicts VO2 max as well as our regression model. Since the p-value was less than 0.05, we can reject the null hypothesis in favor of the alternative hypothesis that our regression model is better. Finally, we used the coefficients table to get the equation of our linear regression model.

Video two will cover variable selection methods to remove insignificant predictors, and video three will cover interpreting the model and checking the assumptions of linear regression.
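As a rough illustration of how the fitted equation from the coefficients table is used for prediction, the sketch below evaluates it in Python. The coefficient values are the rounded figures read out in the captions. The names `predict_vo2_max`, `COEFFICIENTS` and `INTERCEPT` are hypothetical, and the 0/1 gender coding, the units and the example person are assumptions, since the video does not state them.

```python
# Rounded coefficients as read out in the captions. The gender coding (0/1)
# and the units of each predictor are assumptions; the video does not state them.
INTERCEPT = 10.006
COEFFICIENTS = {
    "weight": 0.008,        # pounds (the population weighed 135 to 220 pounds)
    "age": -0.027,          # years
    "gender": 0.933,        # assumed to be a 0/1 indicator
    "walk_time": -0.217,    # one-mile walk time
    "heart_rate": -0.014,
    "temperature": -0.049,  # body temperature
    "height": -0.001,
}


def predict_vo2_max(**predictors):
    """Evaluate the fitted linear equation for one person."""
    missing = set(COEFFICIENTS) - set(predictors)
    extra = set(predictors) - set(COEFFICIENTS)
    if missing or extra:
        raise ValueError(f"missing: {sorted(missing)}, unexpected: {sorted(extra)}")
    # Intercept plus the sum of coefficient * value over all seven predictors.
    return INTERCEPT + sum(
        coef * predictors[name] for name, coef in COEFFICIENTS.items()
    )


# Made-up example values, purely for illustration (not taken from the data set).
person = {
    "weight": 170, "age": 30, "gender": 1, "walk_time": 14,
    "heart_rate": 150, "temperature": 98.6, "height": 70,
}
print(round(predict_vo2_max(**person), 4))
```

Because the model is linear, changing one predictor by one unit while holding the others fixed shifts the prediction by exactly that predictor's coefficient. This is how the values in the coefficients table are usually read.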
Info
Channel: Whattest Stats
Views: 37,106
Rating: 4.92 out of 5
Id: fijTye9nRDU
Length: 7min 57sec (477 seconds)
Published: Thu Jul 25 2013