Basic Excel Business Analytics #48: Data Analysis Regression feature

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
Welcome to Highline BI 348, class video number 48. Hey, if you want to download this workbook, BI Chapter 04 and follow along, click on the link below the video. Hey, we've done a lot in this chapter. We've been doing linear regression, scatter plots, creating estimated equations, R squared, SST, SSC, all sorts of calculations. And we did it all longhand. And we did it with Excel formulas and functions. But we got to look at a great feature. Here's our x, that's weekly ad expense. And here's our y, weekly sales. And we're hoping to predict weekly sales based on weekly ad expense. We're going to do all the calculations we did so far by simply taking our data set, going up to Data. And if you have Data Analysis feature added in, you can use Data Analysis. If you don't have it added in, its easy. Go to File, Options, down here in Add-ins, you click down here, Go. And make sure that checkbox is checked. Escape. All right, so you're ready? I'm going to click Data Analysis. And down here is the Regression feature, click OK. Now we're going to make sure we got our y values. And I definitely want a field name at the top. Control Shift Down Arrow to highlight out all the way down. Tab, the x input. Let me use Control Home to jump up back up to the cell A1. Click in B1, Control Shift Down Arrow. So we have y and x. Now we only have one x here. Next couple of videos, we'll do multiple regression with multiple x's. And it's just as easy as putting multiple columns in. And it will do simple linear regression with a single independent variable x or multiple regression. Now we definitely always what field names, because the output is hard to read unless you have field names at the top. We definitely want to click this when we have field names at the tops. Confidence Interval, we'll talk about that later. But I'm going to do all the features that we should use, even though we're not going to look at all of them now. So I'm going to put 99 here. By default, Confidence Interval comes up as 95. So if you put 99 and then it does one for 95 and 99. Now I want residuals-- we talked about that. That's the difference between the actual y value and your predicted value-- and residual plots. We're not going to talk about them in this video, but a couple videos ahead we will. But that's how we want our dialog box to set up, except for one thing. You've got to tell it where to put this. I'm going to say, hey, go to E1. Now one thing about this dialogue box, if you run different regressions over and over in the same workbook, it'll always remember whatever you did last. So you've got to make sure and visually look through and make sure you've got all right ranges and cells. And then click OK. Now I'm going to mess this up, because that is not the way it comes when you spit it out. It comes like that with all the columns squished up. So you actually always want to highlight the columns, from whichever the start is, all the way to the end. And then double click between any two columns. Double click on that Best fits it. Now I'm going to drag the residual plot down here. We'll talk about that awesome automatic charting later. But let's just go through and see the things we already know. I'm going to highlight these three. And I"m actually going to add some yellow. And I'm going to put some notes off the side. Now multiple r, that really is r, or correlation. R squared, we know what r squared is. Standard error, we just talked about that in our last video. That is standard error of the estimate. Observation, that's our sample size. Some of the things we talked about in last video-- and I'm going to expand the size. I'm going to change the number formatting here. So Control 1. Come over to Number. Use a comma. And I'm only going to show 0 decimals. And click OK. Now this SS means sum of the squares. So when you see total here and SS up here, that means that intersecting cell is sum of the squares total. When you see Residual, that's error or residuals. Those are synonyms. So when you see SS and residual, you know that is SSE. When you see regression and SS at the top, you know that's sum of the squares of regression, or SSR. Here's our estimate of variance. That's where we take the actual sum of the squares of whichever one it is divided by its degrees of freedom. That's our estimate for variance. So really all of these things so far we have discussed. And then when we come down here, we only have a single x variable. So we have two coefficients, they call it, or estimates of population parameters. So there's the intercept, and there is the slope. Down here, we have our predicted values. Control Shift Down Arrow. And right here we have our residuals all automatically calculated. All right, so that's pretty fast and easy. If you're doing a lot of linear regression, then this is a great feature. The only drawback is, of course, it is static. If you want to change your data and have everything update automatically, that's where our formulas that we learned earlier in the class come in handy. Now we will come back and talk about what f is and the p-value for the f and the p-value for our individual estimates of our population parameters and the residual plots in upcoming videos. But so far in the class, there's a lot of calculations that come out automatically when you use the Data Analysis Regression feature. Now if you come down to DA Reg 3, I have like a cheat sheet here that tells you all the individual cells and what they are. All right, next video, we'll come back and we'll talk about the amazing Line Est function that will actually spit out a bunch of these statistics all in one swoop. All right, we'll see you next video.
Info
Channel: ExcelIsFun
Views: 31,339
Rating: undefined out of 5
Keywords: Excel, Microsoft Excel, Highline College, Mike Girvin, excelisfun, Michael Girvin, Mike excelisfun Girvin, Excel Magic Tricks, Business Analytics, BI 348, Data Analysis, : Data Analysis Regression feature, Data Analysis Toolpak, Correlation, R Squared, Standard Error, Slope, Intercept, SST, SSR, SSE, F Test, Test Statistic, t Test Statistics, p-values, predicted vales, residuals, Residual plots
Id: XqXRnbf_4f0
Channel Id: undefined
Length: 6min 36sec (396 seconds)
Published: Wed Nov 25 2015
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.