Excel 2016 - Scatter Plots, Correlation, and Regression

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
okay here I'm gonna go through creating scatter plots some correlation information and regression so I'm gonna do this with the Old Faithful example and then you guys can practice using the triple hop data and I will post a completed excel file for you to check your answers went there so let's start with creating a scatter plot using Excel you're always going to make sure that you have your data labeled X wise I like to put my X on the left and then my Y's on the right that just helps keep everything organized and easier as you are putting information into Excel formulas you're gonna highlight both of the columns together but not the labels or the headers do insert chart you're going to select the first scatterplot there and then we always want to have our Access labeled so I'm going to choose layout one in my quick layouts okay now I can label this except the x-axis says duration so this is the measure of the duration and eruption lasts in minutes and the y-axis is the time to the next eruption we're gonna call this Old Faithful so that's just getting a scatter plot in there and 10 1 you should have read about some correlation stuff we have positive negative nonlinear and no correlation remember that with correlation we're looking at values that are between negative 1 and 1 and the closer it is to negative 1 or positive 1 the more that the stronger the correlation is 0 we would say there is no correlation or a weak correlation because that means that we don't that the data is kind of everywhere when you are talking about our or that you know correlation coefficient make sure that you talk about the strength the type in the direction so when we're looking at this data we see that it should have a positive it looks pretty strong and it looks pretty linear so let's take a look at what that our value is so if I do equals coral and then notice it says array 1 and 2 so for this I'm going to just go ahead and select my X values comma and my Y values really matter which order you put it in for the correlation you'll get the same answer it does matter as we do some of the other excel functions which way you do put it in so here you can see we get a our value of 0.9 7 8 6 5 9 that is very close to 1 so when I interpret this I would say there is a strong linear or a strong positive linear correlation between the duration that Old Faithful erupts and the mid time to the next eruption okay so let's go ahead and move into that ten to stuff that you read about and that is where you look at creating a linear a model or a linear regression model and some residual plots so let's just start by creating my linear model Excel actually will do everything for you if you right-click on one of those points you'll see add trendline I'm gonna select that make sure your linear model is selected and then at the bottom you should see display equation on chart and display r-squared value own chart so we want to make sure we select those so that Excel tells us what the linear model is and you can see that there on your on your scatter plot so if for some reason my set lab or your blackboard quizzes or assignments ask for more than three decimal places you may have to find the X sorry the y-intercept and the slope if you have to do that so slope and remember slope is M in case you guys see that somewhere I'm just going to use equal slope and notice it's known wise and then known X so you're going to select your Y values and then your X values and enter so we're getting the same exact thing I just have more decimal places there your y-intercept except again you're gonna do your wise first and then your x-values and you'll get the same answer thirty three point six eight two nine and then you can set this up so if I write this as my Y hat remember Y hat is what I'm predicting this equation would be to wait for 8:09 I'm just going to use a couple more places times whatever value I'm looking for plus thirty three point six eight okay I'm actually going to use this about this equation and I'm going to come over here stick it in the top it is I know what I predicted values are to create a residual plot okay so Y hat equals and I'm going to go ahead and use point four eight zero nine time thirty three point six eight two nine using what we found instead of the other just to give us a couple more decimal places now I'm going to use cell referencing to make this next part easier so we're gonna do equals and you can either type in twelve point four eight oh nine or you can reference the slope over here just remember if you do that you have to put the dollar signs around it because you don't want that to change when you do a drag down so I'm actually just going to go ahead and type it in so that you guys can see what I'm putting here times but I am gonna grab my X and cell reference that plus thirty three point six eight to nine so now I have an equation entered in to find what it would predict the next time to the eruption to be and you can see that it's pretty close there so now I'm going to grab this and then just drag it down and these are all of my predicted Y values so I'm going to need this when I create my residual plot I'm gonna just bump in another column here because to do residuals do this as reason equals my y minus y hat so that's our equation do this so y minus y hat let's go ahead and drag down so all of these are the residual points and now we can graph that I'm gonna select all of my X values and then I'm gonna select all of the residuals so because I'm skipping make sure you hit control and select and then highlight down and then we're going to insert a new scatter plot so to insert charts over here scatter plot the first one so this is the residual plot and you can see here we have negative values and positive values for my Y and here's my X when we look at this we really want to see that there is no clear pattern to it that means that there is some kind of linear model or correlation between our x and y values okay so I want you guys to try this with the triple hop data and I will be posting an answer key to it
Info
Channel: Cam Nichols
Views: 82
Rating: 0 out of 5
Keywords:
Id: LjsOQ0sQ-iM
Channel Id: undefined
Length: 11min 48sec (708 seconds)
Published: Wed Apr 29 2020
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.