Welcome to Highline BI
348 class video number 54. If you want to download this
workbook, BI 348 chapter five, and follow along, click on
the link below the video. Hey, last video we talked
about charting a time series to help us pick
the right forecast method. Now in this video
we're going to start on the sheet, basic
forecast, and we're going to talk about
different ways to forecast and different ways
to measure forecast error. Now we're actually going to
start over in our PowerPoint. So you can download
this file also. Forecast is simply,
predict future values based on past patterns. Now there's a caveat,
because forecasting is about the unknown future. So although we try to
forecast accurately, we never know if the patterns
we have seen in the past that we are using to
make the prediction will hold into the future. Not only that,
but you never know what will happen in the future. Hey, a couple famous people--
Yogi Berra, in particular-- said it's tough to make
predictions, especially about the future. Now that's a little
bit funny, right? But it's true. We actually don't know
what's going to happen. So we're going to do our best. We're going to use past
data to predict the future. But if those patterns
from the past don't hold, then our predictions are
not going to be very good. Now I want to go to slide 10
to talk about different methods for forecasting. We're going to talk about,
in this video, the naive method, or most recent,
averaging of past values, moving averages, and
exponential smoothing. We'll see all four
in this video, and we'll use the same data
set and see which one, in fact, is more accurate. And the way we do
that is, next slide. Forecast accuracy, or forecast
error-- whichever way you want to look at
it-- and we're going to look at four
different measures. Mean forecast error,
mean absolute error, mean square error-- that'll
be the main one that we use to compare
different methods-- and then a relative measure
called mean absolute percentage error. All right, let's
go over to Excel. Now we're going to use this same
data set-- 12 months-- that'll be the time variable-- and
then the sales in $1,000. We're going to use
the same time series for our four different methods. And the first one
we're going to look at is the naive method or the
most recent forecast method. And you can see, over
here, we've plotted this. And this looks like a horizontal
or constant time series. Now it's going to seem strange
here, because forecasting-- what is forecasting? Well, we know 12 periods, right? And we really want
to estimate the 13th. But once we estimate
this 13th, we actually can't measure if
that's accurate, because it's about the future. So the way we measure forecast
accuracy or forecast error is by using our method
on a bunch of past values and then measuring what the
actual error or accuracy is. So this is going to
seem strange at first, but we'll try and
get the hang of it. Now the most recent
is very simple. For period number two, if we
were sitting in period number one and we wanted to
forecast the next period, we'd just use this value. So the naive method or most
recent is, hey, for period two, we're just going to get
the value from the past. Control-Enter and copy it down. That becomes our forecast. Now we're going to use this
and do a bunch of steps to measure the forecast
accuracy or the error, but estimating for the next
month-- month number 13, right here-- we
simply say, hey, we're going to take month number 12. Now if you look over
here to our chart, you can see that it's
pretty choppy, right, because we're using the
actual value from last period. So in this period, even though
our value turns out to be 25, our forecast was 30,
because we used the value from the last period. Now in subsequent methods, down
here-- our next three methods-- this forecast will
be much less choppy. And there will be
less forecast error. One thing you can say about
the most recent, if it jumps to a new level, of
course, this method would adjust quickly,
whereas methods of averaging don't adjust quite as quickly. Hey, let's calculate
our forecast error. And this should
seem similar to what we did earlier in the
class where we calculated standard deviation-- like, we
took particular value minus x bar, or we calculated errors for
linear regression-- particular value minus the linear
regression predicted value. Well, here, for
forecast error, we're going to take the particular
value minus our forecast. All right, so you can
see, it's positive 4. That means we took our
actual value, subtracted the forecast-- our forecast
is actually under-predicting. We get a net of 4. When I copy this
down, we're going to get some plus and
some minus, because if it is a random fluctuation
around the mean, like this, then some will be
above and some will be below. Here, minus 5 means the actual
forecast method over-predicted. Now for our first
measure, we simply add up all the forecast errors--
Alt-Equals-- and then Enter. And now we want to
calculate a type of average. And each one of these measures
will be dividing by something. Now notice, we can't
divide by the count, over here, because this is 12. So we need to
actually divide by 11. And the way we do that is
we say n equals the count. So I'm going to say
the count of all these. And our k is going to
be the number of periods at the beginning where we
didn't make a forecast. And I'm actually going to do
this in a very strange way. I'm going to use
the Rows function. And now I'm only going
to click on one cell. How many rows are
there for cell C4? There's only one. Now later, we'll have multiple
periods at the beginning. And so we'll simply highlight
it and it will tell us-- the formula will tell us
how many rows there are, even though there's
nothing in that cell. So of course, we're
going to get 1, because there's only
one period we didn't make an actual forecast for. So n minus k will be our
denominator in each one of these calculations here. So I'm going to say,
equals n-- the count of all the time series--
minus the number of forecasts we were not able to make. And so we get 11. So for our mean
forecast error, we're going to say, hey, the total of
all the errors divided by our n minus k. Now there's one big drawback to
mean forecast error, of course. That is, is some
values will be above and some values will be below. So as you add these, they'll
tend to be an offset. So of course, in
earlier in the class, when we calculated
deviations and residuals, we always tended to square them. Our next measure is
going to be called mean absolute error
or sometimes-- not in our textbook,
but in other textbooks-- they'll say, mean
average deviation. And, simply, we take
the actual error amount. And we want the absolute value. So we use ABS-- Control-Enter. And that tells us
the distance from 0. Now we can add them up. There's the total of
the absolute value for all of our forecast errors. And now we simply take that
amount and divide by n minus k. Now we could actually go over
to our PowerPoint for a moment, if you want. Slide number 19 has
a formula for this. There it is-- mean
forecast error. We sum everything up
from time equals k. Remember, k was 1. So k plus 1 means 2. So you go from time 2 all the
way to n-- all of your errors. e sub t is the variable we use
to stand for forecast error. And then we divide by n minus k. Next slide is mean
absolute error. We do the same
summation of our errors except for-- boop, boop--
we take the absolute value and then divide n minus k. Back over to Excel--
another measure, and the one that we're going
to use most often to compare different methods--
is we're going to start by squaring the
error and copying it down. Then we're going to
add it up-- Alt-Equals. And then we're going to
take that total of all the squared forecast errors
and divide by n minus k. And so we get 16.67. If we look at our formula over
in our PowerPoint on slide 21, there it is. The sum from period,
in our case, 2 to n, there's the error squared
divided by n minus k. Back over to Excel here-- so
mean square error and mean absolute error, these
are actual measures-- a type of average of the error. So really, these are
a measure of errors. But the lower both of these
are, the more accurate the forecast method will be. And what we will do
with MAE and MSE, is we will compare these two
to our other three forecasting method and-- methods. And, in particular, the mean
square error we will compare. And whichever one is
lowest, that's the forecast method that we will use. Now there's a problem
with mean absolute error and mean square error. If we were to compare these
measures of error or accuracy to a different time series
that had a different time period, like day or year,
we couldn't do that. So what do we do in that case? We create a relative measure. So for percentage
error, I'm simply going to take the forecast
error and compare it to the actual value. So equals the
forecast error divided by hey, the actual
value-- Control-Enter. And when I copy this down,
I'll get some that are negative and some that are positive. I'm going to add this
up but, of course, that has the same, basic problem. There's some offsetting
of the pluses and minuses. So we will take the absolute
value of those percentages, ABS-- Control-Enter. And copy that down. Now once we have all
of the absolute values of the percentages, we can
add those up-- Alt-Equals. There's the total. And for mean absolute
percentage error, we say equals, hey, the sum
of all the absolute values for the percentages
divided by n minus k. And that will be a
relative measure. So now we can compare the
accuracy of this forecast method for months
to, say, a forecast method that was day or years. Now mean square
error is a measure we're going to use,
in particular, when comparing different methods. So instead of using
a formula like this, which is looking at the count,
and the sum, and a whole column of forecast error
squared, and then over here we're actually
taking the difference between particular value and
forecast for a whole column, we want to see how to do that
calculation in a single cell. And the reason
why is, oftentimes we have the actual values
and the forecast method, or maybe various
forecast method. And so for each actual
values and forecast method, we want to be able to
calculate mean square error. And we can do that. We talked about this in the
prereq class and earlier in this class. Any time you have a
series of calculations-- like that, which is then
followed up by this one, and then this one-- you can
start with the very first one and work your way backwards. So watch this. I'm going to simulate this
whole column right here. And I'm going to use the
Sum Product function. Now Sum Product-- the product
part means multiplying. The sum part means add. We've done this
earlier in the class, where we took multiple arrays
and multiplied and added them. But we also noted that if
you use just a single array, this function, Sum Product,
can handle array operations without any special keystroke. And our single-cell
calculation will involve an array operation. Now watch this. I'm going to take the
entire set of values-- same number as number of
forecasts-- and notice that's a whole range of
values, or an array of values. Then I'm going to do
the math operator minus. Right now, as soon as you
have not a single cell but an array of values or range
of cells and some operator, you're doing an array operation. Then I'm going to say, hey,
the actual forecasted values and subtract that whole range. Now if we highlight
this and hit the F9 key, notice I just simulated
the entire column of forecast errors. 4 minus 4, 4 minus
5-- there it is. 4 minus 2, 4 minus
5, all the way down. So right now I've eliminated
that extra column. Now I'm going to
Control-Z. What do I do with the errors over here? I square them. So I'm going to do my second
array operation, which is take the result of that
subtraction array operation and then caret 2 to square it. Now it will do a second array
operation square and all those here. I'm just going to click that. F9 to look at it. And there I've simulated
this entire column. So 16, 4, 16, 25, 16, 4, 16, 25,
all the way down-- Control-Z. Now I can simply, to get
that value right there, close parentheses. And the Sum part will add. Again, this is using
the Sum Product function just because it can
handle array operations without any special keystroke. If we didn't use
Sum Product, we'd have to use the Sum function
and a special keystroke, Control-Shift-Enter. But I can simply enter that and
that's the numerator part-- F2. Now I need to do n minus k. But I'm going to be
smart about this. n minus k is simply the
count for the number of forecasted values. And I'm going to divide. And then I'm going to
use the Count function. And I want to count the
actual number of forecasts that we made in the first place. And that really is the
meaning of n minus k. Now remember, the
point of this formula is that we wanted ranges to make
our final calculation of MSE only from the column
with the actual values and the actual forecast values. So when I close parentheses
and Control-Enter, boom, there we go. All right, now we're
going to go down and do our next forecasting method. And then after we
complete all four, we'll compare and contrast. This is called averaging
past values forecast method. And what we want is, here, I
need to take the first value. When I get down here, I
need to average 24 and 28. When I get down
here, I need to take the average of 26, 28, and 24. So when I get all the way down
to the last one, that cell, right there, will
average all of these. So instead of doing that
manually, there's an easy way. We can use the Average function
and a special kind of range. Now I'm going to click in the
very first cell and then type a colon, which brings in the
cell reference-- the same cell reference a second time. And then I'm going
to close parentheses. Now I'm not going
to copy it down yet. I'm going to put my cursor
touching the first cell reference and then hit the F4
key to lock just the first cell reference. Now notice that the dollar
sign is in front of the 27. But this 27 does not
have a dollar sign. That means, as I copy
this formula down, that 27 will turn to 28, which
will include that one-- then to 29, which will
include that one. It's called an expandable range. Control-Enter and copy it down. And you can go to
any cell and hit F2. Notice it's expanded. And there's that 28. The dollar sign in front of the
27 stops the 27 from moving. No dollar sign in
front of the 28 means, when I hit Enter
and F2, that moves to a 29. So all the way down,
any cell you pick, that's called an
expandable range. And that is another
way to forecast. Now if we look at
our chart over here, we can see we're already
getting a smoothing effect. The actual forecast--
because it's not up and down, up and down-- we're
probably going to get a much smaller
forecast error. Now let's come over and
calculate our forecast equals-- I'm going to take
the actual particular value minus the forecast. And copy it down. I will definitely get
some that are positive and some that are
negative-- Alt-Equals to add it up, and Enter. Now I come down
here, and whoops, I have to do equals Count--
counting all the values. That's the n, k, equals--
and I'm still going to do that strange little Rows. We still haven't
seen an example where that's kind of any benefit. But that'll give us 1, here. n minus k-- that's the
number of forecasted values. All right, so now I can come
to mean forecast error equals-- hey, there's the
total of the forecast errors divided by n minus k. Again, that has a problem
because the pluses and minus tend to offset. So we're going to use
the absolute value function to get the absolute
value-- Control-Enter. Copy it down. And I'm actually going to
take that formula, which is adding up the 11 cells
above me, and copy it over. That way I don't have to
keep doing Alt-Equals. Now I can come to
mean absolute error equals the total of
all the absolute values of errors divided by n minus k. And there's our
mean absolute error. And you can already see, as
compared to our naive method, it's a lot smaller. Mean square error-- I'm
going to come here-- equals forecast error,
caret 2, Control-Enter. Copy it down. Now I can come
here-- equals the Sum of forecast error
squared divided by R11. I'm going to do the
same array formula. As practice, we're going
to take Sum Product. Why? Because it can do
array calculations without Control-Shift-Enter. Open parentheses-- I take the
actual values-- the same number of actual values as
there are forecasts values-- array operator
minus-- then I highlight all of the forecasted values,
close parentheses, caret 2-- there's our second array-- close
parentheses-- divided by count. And I'm going to count the
number of forecast values. Again, the idea
behind this formula is that we don't
want to have to, eventually, do all
these calculations. But we want a method for
calculating mean square error only from the original values
and our forecasted values-- Control-Enter. So we could see
mean square error-- 8, 10 compared to
16, 27-- so far that's the method
we're going to prefer. Now if we wanted to
calculate our percent error, there's the forecast error
divided by our actual sales. Copy it down. Now I come over
here and calculate, equals ABS-- the absolute value. Copy it down. Now I can come
over here-- equals the Sum of all the absolute
values of those percentages. And I divide by 11. And there's our
relative measure. Now we can clearly see
this forecast method, based on our past values, as a
much smaller mean square error. So we would choose this one. And our forecast
would be, equals-- and it's the average
of all the past values. And so there we go-- 26.25. So we have an estimate
here of 26.25. We can see it there. Our previous method was 29. We can see it there. Now we want to think
about this average here. Notice, we just took
all of the values. Now one disadvantage
to doing this is if there is a
sudden change in level, then this method will
be slow to pick it up, whereas the recent method will
almost instantly pick it up. So instead of averaging
all of the past values, we're going to talk about
our third method, which is moving average forecast method. And it's pretty straightforward. We're just going
to say we're always going to average the
last three values. That way, if we have
a change in level, this method will pick
it up much more quickly than averaging all
the past values would. I'm going to start out by
equals Count all the values. And when I hit Enter,
this is where it equals Rows is going to be a benefit. I need to get the information
about how many values were missing. And I don't want to
type the number in. So I'm simply going to
highlight these three empty cells and Rows will
say how many rows are there from 51 to 53. There's three of them. So now we can calculate equals
our n minus rk, and that is 9. That'll be our denominator
for all these calculations. All right, so the forecast--
it's pretty straightforward. We just use the
average function. And very carefully,
you don't want to include the one in
the current period, because this is the
one being forecasted. We want the last three. Those are relative
cell references. As I copy this
down, it'll always look at the three
previous values. So when I copy this down, you
can see at any particular time, it's one over and it's
got the three above. So in the very last period,
sure enough, the three above. If we were going to do the
actual forecast for period 13-- equals Average--
well, we're just going to look at the last three. And there our forecast
would be three. Now our forecast
error equals-- there's the particular value-- minus our
forecast using moving average. When I copy this
down-- Alt-Equals. And I'm going to copy
this one over right now. Ooh, it looks like
we're getting 0. We don't even need to
do this calculation. But as a template,
we're going to say, hey, equals 0 divided by 9. Now I'm actually going to
lock that with the F4 key, because this formula,
as I copy over, will always take the--
whatever the total is from above and divided
it by that n minus k. So I'm going to
drag this over here. Absolute value equals ABS. And copy it down. And so there I go. There is the total already done. And there's the division,
divided by n minus k, already done-- so 2.66. Squared error equals--
go get our forecast error and square it--
Control-Enter-- copy down. Ooh, and there's our
mean square error of 10. So for comparing averaging
all the past values, averaging just
the last three, it looks like, for this
particular data set, that 8, 10 mean squared error
is going to be smaller. So we would use
averaging all of them. Now we'll see an example--
all of these data sets look like they're pretty
constant horizontal. We'll look at all four
methods with a data set that jumps in just a moment. Now when we want to do our
formula for mean square error all in one cell using
the Sum Product, we want to do the same
calculation, except for, notice, we're only going to
highlight nine of the values and subtract our actual
forecasted values-- caret-- close parentheses,
caret two, close parentheses and divide by the count
of the actual values that have been forecasted. There's 9 of them. And then I'll get
exactly the same thing. Our percent equals
the forecast error divided by our actual value. I can copy that down. Equals ABS, Control-Enter,
and copy it down. Now I can come here. I'm actually going to copy--
Control-C and Control-V-- and, of course,
that one will work. So there's our relative measure. And we can see here--
we started here, and it looks like
it's pretty smooth. Now it looks like this one's
probably a little bit smoother. And we can definitely
see the error. But again, this moving
average will pick up any changes in levels more
quickly than averaging all of the last ones. Now we want to look at
our last method here, exponential smoothing. And it's a type of
weighted average. We're actually going to have
our first forecast value looking at the most recent. But all subsequent forecast
will be a weighted average. We'll have what's called a
smoothing constant, or alpha. Right here, I have
it set at 0.4. But this number
can go from 0 to 1. And it will weight. It'll actually weight
the past particular value a certain amount and the past
forecast a particular amount. And what we'll do is, we're
allowed to change this, after we create all these,
and adjust it until we find the min mean square error. So the way this
works is, equals-- and I'm going to take the
last particular value-- and that's a relative
cell reference-- times the weight, F4 to lock
it, plus the past forecast value, times-- and it's
the complement of alpha, the smoothing factor-- F4, close
parentheses, Control- Enter. And copy it down. Now if we think
about this, right now what this is saying is,
this little bit is saying, I'm going to take 40% of that. And this little bit is saying,
I'm taking 60% of that. So I want to change this value. And I'm going to start at 1. Now think about that. If this smoothing factor
is weighting this, if I change this to
1-- that means 100%-- it's exactly like
the naive method. Hey, only look at
the most recent. But as I decrease this, it
increases the percentage that the past
forecast is taking. And it will smooth the line. So I'm going to change this
to-- and before we change that, look over here. We see the line. It's exactly like our most
recent forecaster naive method. Now when I change
this to 0.8, you can see it, sort of,
starts to smooth out. 0.5-- you could see these values
here, and it's smoothing out. Now if I go to 0.4--
smoothing out more. 0.2-- and what I want to do,
instead of looking at the line and eyeing it, I now want
to create mean square error. And I'm not even going to
bother with these columns. I'm just going to do
our one single cell formula that looks
at the actual values and the actual forecasts. Are you ready? Equals Sum Product,
open parentheses-- I'm going to do the actual
particular values-- exactly the same number as
the number forecast we have-- minus actual
number of forecast, close parentheses, caret
2, close parentheses. That'll give me the total of
the square forecast error. But I need to divide by the
number of actual forecast. I'm going to use
Count, Control-Enter. There's the mean square error. And as I change
this-- 0.5-- whoa, it looks like it went up a lot. 0.05-- that would be a 5% weight
for the last particular value, and a 95% for the last forecast. Now that looks like it's 10--
0.15-- 15-- 0.2-- somewhere around 0.2. Now, actually, I'm going
to show you a cool trick. Next chapter we get to talk
about linear programming and use Solver feature. But we're going to use the
Solver feature to tell us when is this exactly
at its minimum value, by changing this input
cell, because remember. This formula is looking
at all these values. This formula is looking
at those values and this. Now earlier in the class
we saw Goal Seek, where we could point to a
formula and say, hey, I want that to evaluate
to a particular amount by changing an
endpoint input cell. Here I'm going to tell
Solver, change this to, not a particular amount, but
the minimum possible, by changing this input. So you ready? It's under Data. And we've been using
data analysis a lot. Let's use Solver. Now we're going to do a
very simple example here. Set Objective-- that's the cell. What do we want to do to this? Now, actually, the
default is Max. I already had it on Min. But we set it to Min. And then it says-- and
this comes empty-- it says, by changing what cell
or cells-- later, we'll see how to actually
change a number of cells-- I'm going to say just that cell. And using Solver, for just this
simple example, that formula-- min amount, by changing that,
when I click Solve, it says, hey, Solver found a solution. I'm going to click OK. And there it is. That's the smoothing
constant that will minimize our mean square error. Now our actual estimate
for period 13 would be, hey, I'm going to take
the most recent times our smoothing factor, plus
the actual previous forecast, times the complement of
our smoothing factor. And that is our
forecast, right there. Now down here, you can see
I've listed our four methods. So our forecast for
each method, and then our mean square-- and
we can see 8,10 means that the averaging all of
the past values forecast method, that will give us the
smallest mean square error. Now that's given that
we have this data set-- a horizontal or
constant data set. Now, when come back
in our next video, we're going to go to the
sheet, basic forecast two. We're going to have a summary
of all four of these methods. But we're going to have
a different data set. And we'll see that,
using mean square error, we can pick an appropriate
forecast method. All right, we'll
see you next video.