Basic Excel Business Analytics #68: Monte Carlo Simulation Histogram for Construction Project Length

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
Welcome to Highline BI348 class video number 68. Hey, if you want to download this file BI348 chapter 11 video 68 and follow along, click on the link below the video. Hey, we're still in chapter 11 doing simulation. And here's our problem for this video. A construction company builds industrial buildings. The projects are sequential. Groundwork then foundation then the framework then finish. The probability distribution for time in weeks for each step in the process are listed below. Calculate the expected time to finish the project, standard deviation, and the estimated probabilities for the possible outcomes for time to finish the project. In particular, managers want an estimate for the probability of finishing in 30 weeks or less. Now I've color coded this and there's a little legend over here. Right here, this is the probability distribution for groundwork. For here's the probabilities. Here's the number of weeks. So this is from past data that we know this about this part of the project. For foundation, here it is. Probabilities and number of weeks. Framework. Probabilities, number of weeks. To finish the project, probability, number of weeks. Now as we've seen in earlier videos, we can build a simulation formula with a random variable based on a relative frequency distribution from past data like this. The formula for step one, groundwork, will have to reflect these probabilities. Inside this formula for the second variable, here's the foundation. That formula will have to randomly create number of weeks for foundation based on these probabilities. The formula for the third variable will have to use these probabilities to estimate randomly these number of weeks. And finally, for the finished part, that's the fourth random variable in this simulation formula, these probabilities will have to be used by the formula to randomly select between 12, 13, and 14 weeks. Now we've actually done this before. We did it a couple times already. We're going to have to use a special extra column at the front of each one of these probability distributions. For the yellow ones, we're going to have to have cumulative probabilities. For the orange ones, cumulative probabilities. And so on. Then we'll simply use lookup and the Rand function. Now this will be the third time we've seen a formula like this. We've just never had to do four of them all in one formula. Now for cumulative probability, we always start with 0. And then now I need to add here 20% and I want to get down to here 20 and 0.45. So I'm going to use equals sum. Click on the first probability. Colon. Close parentheses. And I have to lock that first cell reference to create an expandable range. I hit F4. Control Enter. And copy down. We can see we have an expandable range. And that will work fine as our lookup value for approximate lookup. We have to do the same thing for each one of these. 0. Enter. Equals sum. Colon. Close parentheses. And lock this with the F4 key. Copy it down. 0 equals sum. The very first one colon close parentheses. Lock the first one. F4. Control Enter and copy it down to get our expandable range. Finally, we have to do our fourth one. 0 equals sum. Colon. Close parentheses. And F4. Control Enter. And copy it down. Now our math formula is going to be this. Total weeks for the project X sub G. That's for our groundwork. Plus X sub foundation. That's for the foundation. X sub frame. That's for framing. Plus X sub finish work. Each one of these is the number of weeks from a probability distribution, which will be randomly selected in our formula. All right. You ready for this? Equals-- and we're not going to use V lookup. We talked about look up. When you're doing approximate match, look up oftentimes is a little bit faster. We use Rand function to randomly generate a number between 0 and 1, which will represent cumulative probability from a uniform distribution. Comma. But the fact that we have an array or a lookup table, first column has cumulative. Second column has the weeks we want to randomly select. And if you're not sure exactly why this works, go back a couple of videos. The video on random variables I explain this in great detail. All right. That's the first one. But watch this. I don't want to have to keep typing this. So I'm actually going to highlight all the way to the comma. That's lookup Rand comma Control C. And now, I'm going to come in close parentheses plus Control V. The lookup value is the same. It's just a different probability distribution table we're looking up. So I highlight. That's for the foundation variable. So far, we have groundwork, foundation random variables. Plus Control V. And I'm going to highlight the third table. The first column has got the cumulative, second table's got the actual random number of weeks for this framing part of our sequential project. Close parentheses. Plus Control V. And we'll enter in the last lookup table. Close parentheses. So now we have 1, 2, 3, 4 randomizing variables based on 1, 2, 3, 4 different probability distributions. So now there is one possibility. If I hit the F9 key to random, there's another possibility. So 38 weeks. 30 weeks. 36 weeks. Now we need to repeat this many times. I'm going to have a number 1 in the cell. Point my cursor to that little fill handle. And then when I see my cross hair, right click, drag down, and back up. I point to series. I want to fill a series down the column. Step value one. I want to end at 10,000. Enter. And there we have 10,000 numbers. Now the trick that we've been using because randomizing formulas copied down 10,000 rows take a long time to calculate is of course we use our data table with an empty cell column input. So I highlight. Control Shift down arrow. Control Backspace. And I go up to Data. Data Tools. What if analysis data table. Or the keyboard Alt, D, T. We do not need a row input. We need a column input. We trick it, give it an empty cell, and the data table will try to keep substituting all these values from the column into there. But it won't work because that cell is not connected to the formula. When I click OK it's just a way of tricking data table to much more quickly create all of our randomize simulated total number of weeks for our project formula. If I hit F9 you can see sure enough it is simulating 10,000 each time. Now we want to go ahead and calculate the mean, the standard deviation, min, and max. And we solve last video how instead of using the individual functions we can use the aggregate function. Now the first argument in aggregate is function number. And I've already put the function number over here. 1 is for average. 7 is for standard deviation. 5 is min and 4 is max. So I click there for function number. We're using the second one down here. Comma. Options. We don't need any options so we select 4. Comma. And then we're using the reference here. So I click on the top cell. Control Shift down arrow. F4 to lock it. All we need is that first reference. Close parentheses. Control Enter. And copy it down. So it looks like the average time to finish this sequential project is about 33 and 1/2 hours. And if we hit F9, we get slightly different estimates, but each one is pretty darn close to 35 and 1/2 hours. Standard deviation is 2.8 hours. The min is 25 and the max is 40. Now I want to show you something interesting when you have a probability distribution like this if we did our expected mean formula, which we learn in our prerequisite class and even earlier in this class, we could do the sum product of the probabilities times the actual X values. Do expected value for the first distribution, add it to the second one, the third one, and the fourth one. But actually, I set this table up just like this because I don't want to do four different sum ifs and just use the plus symbol. I'm going to use the single sum product and use array 1 all of the weeks times all of the probabilities. Now to get this to work right, each one of the individual tables has to add up to 1, which of course it does. But that formula right there could get our answer for our bosses to one of the question which is what's the average time for finishing this sequential project or what's the expected value. And notice when I hit the F9 key, it's virtually exactly the same. Now as we mentioned earlier when we did other simulation examples, that's fine and dandy. And even these statistics are helpful. But seeing the full frequency distribution with all of the probabilities is oftentimes the main reason that we do simulations like this. So that's a pretty nice formula. But we want to see the probability of getting 33.6 hours. Now before we look at our frequency distribution, we can also calculate a formula to answer the boss's questions, what's the probability that the number of weeks is less than or equal to 30? Equals count ifs. Click in the top cell. Control Shift down arrow. Control Backspace. Comma. And now I need to in my formula join the comparative operator to that X value. Double quote less than or equal to end double quote. Then I have to join it using the ampersand to that 30. Close parentheses. Divided by. And I already did my count down there. 10,000. Control Enter. So 14.85. And if I hit the F9 key to randomize our simulated values they get slightly different values for the probability that a particular project will take 30 or fewer hours. Now again, these are individual statistics. We like to do an entire frequency distribution and relative frequency distribution. Now I already did this because we've done this like five times already in this chapter. We listed our upper values for each one of the frequency array function. We entered our frequency array function and calculate each one of the frequencies. Then we calculated our probabilities. Now in this video, which I haven't done earlier, I do want to create a frequency distribution because sometimes you like to see the actual shape of the data. And we can kind of see it. It looks like it's got somewhat of a symmetrical shape with highest one right around where our mean is. So I'm going to highlight the column with the label and just the frequencies. Go up to Insert. Column. Column. Click on the columns. Control 1 to open up our task pane. I'm going to change the gap width to 0. Go over to the fill. Fill. Vary colors by point. Border. Solid line. Let's give it some black. We did a lot of this back in chapter 2 I think it was. Definitely need to go to our green plus and say axis titles. I have that one selected with a solid line. So I simply type in equal sign. That shoots me up to the formula bar. I'm going to click on the relative frequency one and Enter. Now, we click on the horizontal axis. Equals sign. This X variable. And Enter. Right click because that's not the correct set of labels down here. Right click. Select data. And I want to change and edit the horizontal category. I click Edit. And now, I'm going to highlight, including that empty cell at the end. Click OK. Click OK. That's already looking much better. Now, I want to do one last thing. I want to come over here and say data labels. Click on the arrow. More options. That didn't work. It's supposed to pull it right up to this series. Label options. And this is the one I want. But before I do that, I want to uncheck value and click on Values from cell. And I'm going to actually list the relative frequencies or the estimated probabilities. Click OK. Now I want to do something else. I want to come over to the properties in our task pane. Alignment. Come over to text direction, and we want rotate all text to 70. Now, I want to click and try and select the entire inside part of the chart, and click the middle circle and drag down a little bit. Click on the label. Maybe I want to come up to home and change it to 10. 10 and Enter. Now I have a visual picture and the actual probability. So I can hit the F9 key and watch my histogram with my relative frequencies or estimated probabilities change. So now we can answer whichever question our boss might have. There's the average time. There's the probability of less than 30. And we can pick out any particular interval here and look at the probabilities or look at our histogram. All right. So in this video, we saw how to take 1, 2, 3, 4 different relative frequency distributions, build four different randomizing variables into our simulation formula, and then answer some questions about number of weeks to finish a sequential project. All right. So in our next video, which will be our last video for chapter 11, we'll see how to calculate the probability that a particular team will win the World Series using simulation. All right. We'll see you next video.
Info
Channel: ExcelIsFun
Views: 12,379
Rating: undefined out of 5
Keywords: Excel, Microsoft Excel, Highline College, Mike Girvin, excelisfun, Michael Girvin, Mike excelisfun Girvin, Excel Magic Tricks, Business Analytics, BI 348, Data Analysis, Monte Carlo Simulation, RAND function, Data Table Feature, Column Input Empty Cell, Construction Project Length, Sequential Project, Histogram based on simulated values, LOOKUP and RAND function, Histogram Chart, Histogram Column Chart, Data Table feature run a simulation
Id: zoYFPha_RFE
Channel Id: undefined
Length: 15min 55sec (955 seconds)
Published: Mon Dec 28 2015
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.