Chapter 10.4: Multiple Linear Regression: Controlling for Variables - An Introduction

Video Statistics and Information

Captions
Hello and welcome to Chapter 10.4 from Stevens's Introduction to Statistics, the Think & Do book. In this chapter we're going to discuss multiple linear regression. It's just an introduction, in hopes that this little cliffhanger may inspire you to take another course in statistics. We're going to control for variables in our linear regression.

The whole motivation for this is a statement we made in the previous section, actually in Section 10.2: correlation does not prove cause and effect. We made that very clear. We said it doesn't mean there isn't cause and effect, but correlation doesn't prove it. But isn't that really what we want to know? Don't we want to know how the world works? Random patterns and associations and correlations are all well and good, but I want to know how things work. I want to know what causes something else to change.

One of the ways to help obtain this cause-and-effect relationship is through controlling for a variable in my linear regression. When I say we're controlling for a variable, I mean I'm actually going to include other variables. So instead of having just the one variable, where we did cricket chirps and temperature, we might have another variable in there: cricket chirps, temperature, and time of night, or something like that. Including other variables is what we mean by controlling for variables. It may sound like we're actually reducing our sample size, so that all measurements were taken at a certain time or the items in the sample are all the same, but that's not necessarily the case. When you control for a variable, you actually just introduce it into your model.

So here's an example where we will do that eventually. We don't start there; we start with this preliminary example, incumbent campaign spending. What we have here is a list of 15 elections where incumbents were involved.
An incumbent is the person holding the position who is also trying to win the election for that position for the following term. So there are 15 of them. In the first column we have the campaign expenditures in thousands of dollars, so this one shows a campaign expenditure of 70.67 thousand dollars. The second column gives the incumbent's performance in terms of the percentage of votes.

If you take this data and put it in a scatterplot, we'll put the expenditures down here, because it's commonly believed that campaign expenditures cause changes in the outcome, and we'll put the performance in terms of percent votes on the y-axis. What you'll notice right away is that it is a decreasing relationship. If you get the correlation coefficient, you get negative 0.611, and if you look at n = 15 for sample size, what you'll see is you have to beat 0.514. So the correlation is indeed significant. Remember, we have to take the absolute value, so we actually get positive 0.611 to compare to our critical value. So we have a significant negative correlation between incumbent campaign expenditures and how well they do in the election.

That's sort of counterintuitive, right? We always thought the more money you spend, the better you do, but in this case it looks like the more money you spend, the worse you do. In fact, look at the slope, negative 0.2. What this says is that for every extra thousand dollars an incumbent spends on the campaign, he or she can expect to lose 0.2 percentage points. All right, not very bad. So then ask yourself: is the extra spending causing the incumbent to do worse in the election? No matter how you work around this, it's hard to believe that spending the extra money causes the incumbent to do worse.

So is there a lurking variable, perhaps? If we think about lurking variables, as far as an incumbent's reelection goes, probably the number one factor is how well he or she did in their first term. If they had a terrible first term, that's going to really hinder their ability to win the second term. If they did really well, then they have a good chance of winning the next term. So what if we include that? We're going to control for how well they did in their first term, the actual quality of their first term, in terms of their pre-election approval rate. That's on the next page.

So now we have two independent, or causative, variables: the incumbent's pre-election approval rate and their campaign expenditures, just like we had before. And then we have the performance, how they did in the election. What this first scatterplot shows is that the higher the pre-election approval, the less they spend on their campaign, which makes sense: if you're doing really well and everyone likes you, you don't have to spend as much on your campaign. That is in fact a significant correlation, if you check in the back of the book.

Then if you look at our second scatterplot, the higher the pre-election approval rating, the better the incumbent does in the election, which also makes perfectly good sense. The world is starting to make sense again. In fact, if you look at r, that's 0.980, a significant positive correlation.

So really it's the pre-election approval rating that we were missing in our last analysis. In the last comparison all we had was campaign expenditures and incumbent performance, and basically what it showed was that the more they spent on a campaign, the worse they did. But that actually makes perfectly good sense when you bring in the pre-election approval rating, because the lower campaign expenditure was caused by doing well in the pre-election approval rating, which then also caused an increase in performance in the election. So suddenly the world makes more sense. We control for variables by including the pre-election approval rating in our model, and things come out to have a little more meaning.

Now that we have two independent variables, we can have a multiple linear regression equation. Again, we need software to calculate these values, but what it says is that the percent of votes in the election is equal to 0.8, that's one slope, times the current approval rating, plus 0.10 times the spending, plus the intercept. That's m1 x1 + m2 x2. So there are two slopes: the 0.8 and the 0.10.

What the 0.8 is saying is that for every percentage-point increase in approval, the incumbent can expect a 0.8 percentage-point increase in the election. That's good; it certainly demonstrates that pre-election approval rating is a big factor. But what it also shows is that there's a positive relationship between spending and election performance. What that says is that for every extra thousand dollars spent, the incumbent can expect a 0.1 percentage-point increase. It's not a big increase, but at least it's positive. It makes sense: if you're going to spend more money on your election, that should help in some way, and what this says is that it does, but it's not nearly as important as the approval rating.

So suddenly the cause and effect makes much more sense. It all fits together nicely, but we needed to control for that variable of pre-election approval rating. This is called multivariable regression, or multivariable correlation, and it would be in a second course in statistics. So hopefully this has inspired you to consider taking such a second course.

That wraps up Chapter 10. We'll do the summary worksheet and move on to Chapter 11. It was a pleasure having you here, and I will see you in the summary worksheet. Bye.
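
The significance check described in the captions (a correlation of about -0.611 from 15 elections, compared against a table value of 0.514) can be reproduced with a short sketch. The captions do not state the significance level; the code below assumes a two-tailed test at alpha = 0.05, for which the t critical value with 13 degrees of freedom is 2.160. The variable names are illustrative only.

```python
import math

r = -0.611      # correlation coefficient quoted in the captions
n = 15          # number of elections in the sample
t_crit = 2.160  # assumption: two-tailed t critical value, alpha = 0.05, df = 13

df = n - 2

# Critical |r| implied by the t critical value.
r_crit = t_crit / math.sqrt(t_crit ** 2 + df)

# Equivalent t statistic for the observed correlation.
t_stat = r * math.sqrt(df) / math.sqrt(1 - r ** 2)

# The correlation is significant when its absolute value beats the critical value.
significant = abs(r) > r_crit

print(f"critical |r| = {r_crit:.3f}")
print(f"t statistic  = {t_stat:.3f}")
print(f"significant  = {significant}")
```

Under that assumption the computed critical value rounds to the 0.514 mentioned in the video, which suggests the book's table uses the same two-tailed 5% convention.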
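
The sign flip described in the captions (spending looks harmful on its own, but helpful once approval is in the model) can be illustrated with simulated data. This is a minimal sketch, not the book's data set: the 15 elections are not reproduced here, and every number other than the two slopes 0.8 and 0.10 (the intercept of 10, the way spending depends on approval, the noise levels, the sample size of 2000) is an assumption chosen only for illustration. The helper names `simple_slope` and `two_predictor_fit` are hypothetical.

```python
import random

def simple_slope(x, y):
    """Least-squares slope of y on a single predictor x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def two_predictor_fit(x1, x2, y):
    """Least-squares fit of y = m1*x1 + m2*x2 + b via the 2x2 normal equations."""
    n = len(y)
    m_x1, m_x2, m_y = sum(x1) / n, sum(x2) / n, sum(y) / n
    d1 = [a - m_x1 for a in x1]
    d2 = [a - m_x2 for a in x2]
    dy = [a - m_y for a in y]
    s11 = sum(a * a for a in d1)
    s22 = sum(a * a for a in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * b for a, b in zip(d1, dy))
    s2y = sum(a * b for a, b in zip(d2, dy))
    det = s11 * s22 - s12 ** 2
    m1 = (s1y * s22 - s2y * s12) / det
    m2 = (s2y * s11 - s1y * s12) / det
    b = m_y - m1 * m_x1 - m2 * m_x2
    return m1, m2, b

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 2000                # synthetic sample size (assumption)

# Synthetic approval ratings in percent.
approval = [rng.uniform(30, 70) for _ in range(n)]
# Assumption: popular incumbents spend less (thousands of dollars).
spending = [150 - 1.5 * a + rng.gauss(0, 5) for a in approval]
# Votes follow the form quoted in the captions: 0.8*approval + 0.10*spending + intercept.
votes = [10 + 0.8 * a + 0.10 * s + rng.gauss(0, 2)
         for a, s in zip(approval, spending)]

# Regression on spending alone: approval is a lurking variable.
naive_slope = simple_slope(spending, votes)

# Regression controlling for approval by including it in the model.
m_approval, m_spending, intercept = two_predictor_fit(approval, spending, votes)

print(f"spending slope, approval ignored:    {naive_slope:.3f}")
print(f"spending slope, approval controlled: {m_spending:.3f}")
print(f"approval slope:                      {m_approval:.3f}")
```

In this simulated setup the spending slope comes out negative when approval is left out and positive when it is included, which is the same pattern the video describes. Note that no observations are dropped in the second fit; controlling for approval only adds a column to the model.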
Info
Channel: Scott Stevens
Views: 52,972
Rating: 4.8498292 out of 5
Keywords: Statistics (Field Of Study), Statistics, Introduction to Statistics, Think & Do, Stevens, Scott Stevens, worldwide center of mathematics, Chapter 10.4, Regression, Multiple Linear Regression
Id: pcObydOsMXc
Length: 9min 48sec (588 seconds)
Published: Fri Aug 09 2013