Welcome to Highline BI348
class video number 68. Hey, if you want to
download this file BI348 chapter 11 video 68 and follow
along, click on the link below the video. Hey, we're still in chapter
11 doing simulation. And here's our problem
for this video. A construction company
builds industrial buildings. The projects are sequential. Groundwork then foundation
then the framework then finish. The probability
distribution for time in weeks for each step in
the process are listed below. Calculate the expected
time to finish the project, standard deviation,
and the estimated probabilities for the possible outcomes for
time to finish the project. In particular, managers
want an estimate for the probability of
finishing in 30 weeks or less. Now I've color coded
this and there's a little legend over here. Right here, this is the
probability distribution for groundwork. For here's the probabilities. Here's the number of weeks. So this is from past
data that we know this about this part of the project. For foundation, here it is. Probabilities and
number of weeks. Framework. Probabilities, number of weeks. To finish the project,
probability, number of weeks. Now as we've seen
in earlier videos, we can build a
simulation formula with a random variable based
on a relative frequency distribution from
past data like this. The formula for step
one, groundwork, will have to reflect
these probabilities. Inside this formula for
the second variable, here's the foundation. That formula will have
to randomly create number of weeks for foundation
based on these probabilities. The formula for
the third variable will have to use these
probabilities to estimate randomly these number of weeks. And finally, for
the finished part, that's the fourth random
variable in this simulation formula, these
probabilities will have to be used by the
formula to randomly select between 12, 13, and 14 weeks. Now we've actually
done this before. We did it a couple
times already. We're going to have to
use a special extra column at the front of each
one of these probability distributions. For the yellow ones, we're
going to have to have cumulative probabilities. For the orange ones,
cumulative probabilities. And so on. Then we'll simply use lookup
and the Rand function. Now this will be the third time
we've seen a formula like this. We've just never had to do four
of them all in one formula. Now for cumulative probability,
we always start with 0. And then now I need
to add here 20% and I want to get down
to here 20 and 0.45. So I'm going to use equals sum. Click on the first probability. Colon. Close parentheses. And I have to lock that
first cell reference to create an expandable range. I hit F4. Control Enter. And copy down. We can see we have
an expandable range. And that will work fine
as our lookup value for approximate lookup. We have to do the same
thing for each one of these. 0. Enter. Equals sum. Colon. Close parentheses. And lock this with the F4 key. Copy it down. 0 equals sum. The very first one
colon close parentheses. Lock the first one. F4. Control Enter and copy it down
to get our expandable range. Finally, we have to
do our fourth one. 0 equals sum. Colon. Close parentheses. And F4. Control Enter. And copy it down. Now our math formula
is going to be this. Total weeks for the
project X sub G. That's for our groundwork. Plus X sub foundation. That's for the foundation. X sub frame. That's for framing. Plus X sub finish work. Each one of these is the number
of weeks from a probability distribution, which will
be randomly selected in our formula. All right. You ready for this? Equals-- and we're not
going to use V lookup. We talked about look up. When you're doing
approximate match, look up oftentimes is
a little bit faster. We use Rand function
to randomly generate a number between 0
and 1, which will represent cumulative probability
from a uniform distribution. Comma. But the fact that we have
an array or a lookup table, first column has cumulative. Second column has the weeks
we want to randomly select. And if you're not sure
exactly why this works, go back a couple of videos. The video on random variables
I explain this in great detail. All right. That's the first one. But watch this. I don't want to have
to keep typing this. So I'm actually going
to highlight all the way to the comma. That's lookup Rand
comma Control C. And now, I'm going
to come in close parentheses plus Control V. The lookup value is the same. It's just a different
probability distribution table we're looking up. So I highlight. That's for the
foundation variable. So far, we have groundwork,
foundation random variables. Plus Control V. And I'm going
to highlight the third table. The first column has got the
cumulative, second table's got the actual random
number of weeks for this framing part of
our sequential project. Close parentheses. Plus Control V. And we'll
enter in the last lookup table. Close parentheses. So now we have 1, 2, 3, 4
randomizing variables based on 1, 2, 3, 4 different
probability distributions. So now there is one possibility. If I hit the F9 key to random,
there's another possibility. So 38 weeks. 30 weeks. 36 weeks. Now we need to repeat
this many times. I'm going to have a
number 1 in the cell. Point my cursor to that
little fill handle. And then when I see my cross
hair, right click, drag down, and back up. I point to series. I want to fill a
series down the column. Step value one. I want to end at 10,000. Enter. And there we have
10,000 numbers. Now the trick that
we've been using because randomizing formulas
copied down 10,000 rows take a long time to
calculate is of course we use our data table with
an empty cell column input. So I highlight. Control Shift down arrow. Control Backspace. And I go up to Data. Data Tools. What if analysis data table. Or the keyboard Alt, D, T. We do not need a row input. We need a column input. We trick it, give
it an empty cell, and the data table
will try to keep substituting all these values
from the column into there. But it won't work
because that cell is not connected to the formula. When I click OK it's just a
way of tricking data table to much more quickly create
all of our randomize simulated total number of weeks
for our project formula. If I hit F9 you
can see sure enough it is simulating
10,000 each time. Now we want to go
ahead and calculate the mean, the standard
deviation, min, and max. And we solve last
video how instead of using the
individual functions we can use the
aggregate function. Now the first argument in
aggregate is function number. And I've already put the
function number over here. 1 is for average. 7 is for standard deviation. 5 is min and 4 is max. So I click there
for function number. We're using the
second one down here. Comma. Options. We don't need any
options so we select 4. Comma. And then we're using
the reference here. So I click on the top cell. Control Shift down arrow. F4 to lock it. All we need is that
first reference. Close parentheses. Control Enter. And copy it down. So it looks like
the average time to finish this
sequential project is about 33 and 1/2 hours. And if we hit F9, we get
slightly different estimates, but each one is pretty darn
close to 35 and 1/2 hours. Standard deviation is 2.8 hours. The min is 25 and the max is 40. Now I want to show you
something interesting when you have a probability
distribution like this if we did our expected mean
formula, which we learn in our prerequisite class and
even earlier in this class, we could do the sum product
of the probabilities times the actual X values. Do expected value for
the first distribution, add it to the second one, the
third one, and the fourth one. But actually, I set this
table up just like this because I don't want to
do four different sum ifs and just use the plus symbol. I'm going to use the
single sum product and use array 1 all of the weeks
times all of the probabilities. Now to get this to
work right, each one of the individual
tables has to add up to 1, which of course it does. But that formula right
there could get our answer for our bosses to one
of the question which is what's the average
time for finishing this sequential project or
what's the expected value. And notice when
I hit the F9 key, it's virtually exactly the same. Now as we mentioned earlier
when we did other simulation examples, that's fine and dandy. And even these
statistics are helpful. But seeing the full
frequency distribution with all of the probabilities
is oftentimes the main reason that we do
simulations like this. So that's a pretty nice formula. But we want to see
the probability of getting 33.6 hours. Now before we look at our
frequency distribution, we can also calculate a formula
to answer the boss's questions, what's the probability that the
number of weeks is less than or equal to 30? Equals count ifs. Click in the top cell. Control Shift down arrow. Control Backspace. Comma. And now I need to in my formula
join the comparative operator to that X value. Double quote less than or
equal to end double quote. Then I have to join it using
the ampersand to that 30. Close parentheses. Divided by. And I already did
my count down there. 10,000. Control Enter. So 14.85. And if I hit the F9 key to
randomize our simulated values they get slightly different
values for the probability that a particular project
will take 30 or fewer hours. Now again, these are
individual statistics. We like to do an entire
frequency distribution and relative frequency
distribution. Now I already did this because
we've done this like five times already in this chapter. We listed our upper values for
each one of the frequency array function. We entered our
frequency array function and calculate each one
of the frequencies. Then we calculated
our probabilities. Now in this video, which
I haven't done earlier, I do want to create a frequency
distribution because sometimes you like to see the
actual shape of the data. And we can kind of see it. It looks like it's got
somewhat of a symmetrical shape with highest one right
around where our mean is. So I'm going to highlight the
column with the label and just the frequencies. Go up to Insert. Column. Column. Click on the columns. Control 1 to open
up our task pane. I'm going to change
the gap width to 0. Go over to the fill. Fill. Vary colors by point. Border. Solid line. Let's give it some black. We did a lot of this back
in chapter 2 I think it was. Definitely need to go to our
green plus and say axis titles. I have that one selected
with a solid line. So I simply type in equal sign. That shoots me up
to the formula bar. I'm going to click on the
relative frequency one and Enter. Now, we click on
the horizontal axis. Equals sign. This X variable. And Enter. Right click because
that's not the correct set of labels down here. Right click. Select data. And I want to change and
edit the horizontal category. I click Edit. And now, I'm going to
highlight, including that empty cell at the end. Click OK. Click OK. That's already
looking much better. Now, I want to do
one last thing. I want to come over here
and say data labels. Click on the arrow. More options. That didn't work. It's supposed to pull it
right up to this series. Label options. And this is the one I want. But before I do that, I want
to uncheck value and click on Values from cell. And I'm going to actually
list the relative frequencies or the estimated probabilities. Click OK. Now I want to do something else. I want to come over to the
properties in our task pane. Alignment. Come over to text direction, and
we want rotate all text to 70. Now, I want to click and try and
select the entire inside part of the chart, and
click the middle circle and drag down a little bit. Click on the label. Maybe I want to come up to
home and change it to 10. 10 and Enter. Now I have a visual picture
and the actual probability. So I can hit the F9 key
and watch my histogram with my relative frequencies or
estimated probabilities change. So now we can answer whichever
question our boss might have. There's the average time. There's the probability
of less than 30. And we can pick out any
particular interval here and look at the probabilities
or look at our histogram. All right. So in this video, we
saw how to take 1, 2, 3, 4 different relative
frequency distributions, build four different randomizing
variables into our simulation formula, and then answer some
questions about number of weeks to finish a sequential project. All right. So in our next video, which will
be our last video for chapter 11, we'll see how to
calculate the probability that a particular team will
win the World Series using simulation. All right. We'll see you next video.