Basic Excel Business Analytics #65: Create Random Variables: Custom, Binomial, Uniform, Normal

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
Welcome to Highline BI348 Class Video Number 65. If you want to download this workbook, BI348Chapter11Video065 and follow along, click on the link below the video. Last video we're talking about creating a Monte Carlo simulation. This video we want to talk about creating random or uncertain variables in Excel for, of course, our Monte Carlo simulation. Now we're going to see four different variables based on four different distributions. We'll look at our relative frequency distribution variable, binomial distribution variable, uniform, and normal distribution variable. Now remember, before we can create random variables, we must have a good estimate of what the probability distribution is for that variable. So in our first example, we'll go over two discrete probability distribution sheet. Here is our variable we need to randomize, direct labor cost per unit. Now this is a discrete variable, and from a historical relative frequency distribution-- that means from the accounting department we have estimated these probabilities for these direct labor costs. Now in the process we rounded them off, so we have an actual 39, then a 40, then a 41. So this is a discrete distribution. Actually, last video we had a cost example. And we didn't round it off. And we treated it as a continuous uniform variable. But this one we have a different set of probabilities and it is a discrete variable. So how in the world do we take this, a relative frequency distribution or a probability distribution, and use it to create a random variable? Well, we're first going to look at the RAND function. Last video we looked at RAND between. Here we're going to look at RAND. Both of these functions use a uniform distribution for randomly generating numbers. Now, this uniform distribution for RAND won't work straight out of the box, but we'll do something interesting in the formulas down here to get this probability distribution to show up in our formula. Now what does the RAND function do? I actually want to backspace, because this is an argumentless function. You don't put anything in between the open and the closed parentheses. The RAND function will automatically generate a random number when I Control Enter, between 0 and 1-- 0 is included, 1 is not-- up to 15 digits long. Now I don't have all the digits showing here. You could increase or decrease. But there's 15 digits there. That means like 999 trillion possible values. And it is selecting amongst all those values in accordance with a uniform distribution. And every time I hit the F9 key, you can see the formula is randomizing, selecting a number between 0 and 1. So this by itself is not going to help us with this distribution. But we want to get clever here. We want to convert this into a lookup table and have this RAND as our lookup value. Now let's just hit F9 here, F9 here. 99. One more, 32. If I were to add, let's say, these first three probabilities up, this would be 45. So 32. If I were to use this as a lookup value and have a column with cumulative probabilities, it would say 0.1, 0.25, 0.45. So this would hit the first bigger one and jump back. And it would select that 40. Now I want to go ahead and create this extra column here, because that's the trick for using a historical relative frequency distribution. Now the first value has to be 0, because when you have a VLOOKUP you have to have the smallest possible value. Now this value right here has to be 0.1. This value right here has to be the sum of both of these, which is 0.25. So watch this. I'm going to get tricky. I'm going to use the Sum function with an expandable range. I'm going to click in that cell C5. Colon gives me the second C5. Close parentheses. Put your cursor right in the first cell reference, F4 to lock it. That means the first C5 is locked but the second one is not. Because I'm copying down across rows, that 5 will turn to 6, then 7, successively adding and creating a running total or a cumulative total. Control Enter. And copy it down. To see that expandable range, select any cell and F2. You can see the 5 moved to 6, and the range is expanding. If I come down to the end, boom. That's to 10. That has expanded the whole way. So for the first column of our lookup table, we need our cumulative total. Now we can see explicitly if this is our first column for VLOOKUP or LOOKUP, because we're doing an approximate match. That's the first column. This is the second column with the thing to go and get and bring back to the cell. Notice 0.400 somewhat. It comes racing down, bumps into the first value bigger, jumps back. And it would go and get the 41. Now lookup functions and VLOOKUP with approximate match don't literally search through, bump into the first bigger one, and jump back. That's just a metaphor for understanding how a binary search works. Let's try this. Let's us go ahead. And by all means, if you want to use VLOOKUP and use the RAND as the lookup value, the lookup table as this and then comma 2 for a column index. But we're going to end up doing a lot of formulas for our simulations. And I don't want to have to type the extra column here. So if we know we're doing approximate match lookup, I'm just going to use LOOKUP. And actually we covered this topic extensively back in Chapter 7. So I'm using Equals LOOKUP. And I also have a slide in the PowerPoints, if you download them, about how LOOKUP function works, and how it's different than VLOOKUP. So I'm going to look this up. There's the lookup value. Comma. And the array is the one I want. First column, second column. Now the beautiful thing about LOOKUP is it will always select a value from the last column in the array table. It happens to be two columns, so of course it's going to get that. That's all I have to put. Close parentheses and Control Enter. There's our random value. So what did it do? It took 11%. It raced down. It bumped into the first bigger one, jumped back, and got the 40%. If I hit F9 key to randomize 36, it raced through, bumped into the first bigger one, jumped back, got the 41. I could keep going, randomizing. You can also see a visual over here. So LOOKUP is taking the cumulative probability all the way up to the 42, stopping there. And so it's getting the 42. F9. Here it's getting the 41. I'm hitting F9 a bunch. Now it's getting the 43. Now I'm going to add a little formatting over here. And I followed my convention. The cells that have something typed into them have no fill. The ones that have a formula have the fill. And that's how we can create a random variable based on a custom relative frequency distribution or estimated probabilities. Now when we get to our formulas later, we're not going to do it in two separate cells. We're simply going to go LOOKUP, RAND, argumentless function. Comma. The ARRAY. Highlight first and second column. It will automatically get it from the second column. And Control Enter. Now if I hit F9 here, these two are going to be different. Because that cell is linked to that RAND. And this one has a RAND internally, F9, F9. So I'm randomly selecting based on this frequency distribution. You know, if you didn't trust this and wanted to prove that this formula works, go ahead and we learn how to do data tables last video, do a data table with 10,000 rows based on that formula. And then do a pivot table based on it and build your own probability distribution. And you'll see that the simulated value probability table will be almost exactly equal to this. Now that is our first example. Now let's go over to discrete binomial distribution example. Now we have a situation here. We have number of people who attend a seminar. And that's going to be number of trials for our binomial experiment. So 35 people come. A success is going to be whether or not any one of those attendees at the seminar becomes a new customer. And the probability of success, from our past experience for any one attendee becoming a new customer, is about 6.5%. Now before we can use the binomial distribution, we have to run our four tests for whether or not it's a binomial experiment. Are there are a fixed number of trials? Yes. Only two outcomes for each trial? Yes. With either a new customer or they're not. So we'll say yes. Probability of success stays the same each time? Yes. This will stay the same each time. Each trial is independent? We will assume that that's true for this experiment. If all of these are true, then we can use the binomial distribution. Now we don't have to get fancy with VLOOKUP, because there's a function called BINOM.INVERSE just like there will be one called NORM for normal distribution dot inverse. And this actually will create a value for us based on a probability distribution. If we give it a probability BINOM.INVERSE will spit out a variable. The trick is that the probability we're going to give it is from the RAND function. So every time we hit F9, BINOM.INVERSE will spit out a random variable based on the binomial distribution. Now I'm going to do Equals RAND argumentless function. Equals BINOM. And in our prereq class, we did BINOMDIST a lot. We even saw BINOM.DIST.RANGE. But we actually never did BINOM.INVERSE. But no problem. It's relatively easy to understand. Number of trials, that is our 35. That means we have 35 chances to convert the attendees to new customers. Comma. The probability for each one of those attempts, based on past data, is 0.65. And Alpha is the cumulative probability. So if I put this 0.68 into Alpha, BINOM.INVERSE will think that we're going to up to that particular probability, whatever it is, and it will pick the number of successes-- that's our x variable-- in 35 tries. So when I close parentheses and Control Enter, for this first F9 or running of our experiment, we got a random variable of 1. That means we got one new customer out of 35. If I hit F9, this random time we got four new customers out of 35 tries. F9, F9. Now you can see that this distribution, given 0.065, we can see that up past eight, it's 0 all the way. So we're pretty much stuck getting between about 0 and 7. Now we can put it all together, BINOM.INVERSE. Remember, DIST always gives us the probability if we put in an x. INVERSES always gives us the x, if we put in a probability. And that's true for all of the distribution functions. INVERSE gives us the x value. DIST gives us the probability. So I'm going to use the inverse number of trials. Comma. Probability of success on any one. And our alpha, RAND, argumentless function. Close parentheses. Close parentheses. And there we go. So again, these two are different. That formula right there is linked to that RAND. This one has an internal RAND. So as I hit F9, that is our randomizing formula for a binomial variable. Now we want to go over and look at our third example, continuous uniform probability distribution. Now we actually did example in last video of this one. Here is our variable, material cost per unit. It is a continuous variable. We're assuming uniform distribution, the min and the max value. Now unlike last video, I'm going to show you two different ways. And I kind of like the first way, and this is the way we did it last variable. Equals, well, RAND. And RAND between both use uniform distribution. So I'm using RAND between. The bottom comma and the top, or the min and the max. Now right now it would treat this as a discrete variable, if I F9, F9, and you don't need to worry about this. This is linked below down here. But as I hit the F9 key, F9, it's treating this not as a continuous but as a discrete variable, which is fine if you have that. But we're going to F2, and just like we did last video, we're going to multiply this times 100, multiply this times 100. That will give us 8615. 9233. I need to slide the decimal over to the left two positions, so F2 divided by 100. And that will be our formula. 9858, F9, 105. Wow. So that's an example where we got exactly the max value. Look at that. F9. If you run a simulation, whatever max value you'll have or whatever particular penny value you have, if you did it 10,000 times, each one would be about the same percentage. Now there's an alternative. And before Excel 2010, when this was added automatically, oftentimes we had to do this one. We had to say whatever the min is-- and actually this is a formula they show you in the textbook-- and we have to add to the Min the difference between the Min and Max. And you have to say Max minus Min. Now think about this. What's the difference? $30 plus $75? If I were to enter this right now it would always give us the max value. F2. But what if I multiplied the difference times 0. Then, of course, I'd get the Min value. F2. What if I multiplied the difference between 1. Well, of course I'd get the max value. Well, that should lead us right to our conclusion. We can simply use the RAND function. The RAND delivers a number between 0 and 1. Control Enter. Now I'm going to Control Shift tilde to remove the number formatting. If you start hitting F9, remember that the RAND function has a 15-digit number, so there's 15 digits on every single one of these. So we probably want to round this. We actually want to round just the second part. That's already an integer or a whole number. So round. And then I'm going to come to the end. Comma. And the number of digits, I want pennies, so I do two. Close parentheses. Control Enter. Now these are two different formulas based on a RAND here and a RAND between, so they'll be creating different numbers if I hit F9, F9. F9. So that's how you can do it for a uniform, either one of those formulas. We have one last variable to look at, continuous normal probability distribution. Now our variable we want to randomize is demand for a product, continuous. And it is a normal distribution. That means from our past data we've already determined that demand has a normal distribution. If that's the case, and we want to create a randomizing formula, we need to know the mean and the standard deviation from our past data. And so we do have that. Now again, we're going to use the RAND function, which will create a uniform distribution. But now we want to see, just like for BINOM.INVERSE, there's a norm for normal distribution. And inverse means I will calculate your x value. So you ready? Equals NORM, DIST is for probability, INVERSE is for our x. There's the probability. And like all of our distribution functions, it will calculate from the smallest value-- in our case, negative infinity-- all the way up to the x we're giving it. So that means if we give it a probability, it will assume from negative infinity up to some point. So I can simply click on this, 77%. That's going to be somewhere up in here. Comma. And then we need to give it the mean. Comma. And the standard deviation. And Lo and behold, now we're getting a random variable based on the normal distribution. Now here's a picture, right? That probability of 0.91, it goes all the way up to there. And so the NORM.INVERSE will tell you whatever the x value is. So the demand here would be 24,470. If I hit F9, F9, you can see different probabilities from the RAND function get thrown into NORM.INVERSE and then it spits out the x value. F9. Now if this is demand, F2 we're going to use round. Open parentheses. Comma, 0. Close parentheses. Control Enter. Now if I hit F9, you can see we're getting all sorts of different possible random values for our demand. Now remember the normal distribution, the bulk of probability is right in here. So it will be highly unlikely that we get something down here or up here. And of course, way out here, past three standard deviations in either direction, it's going to be incredibly small, like 0.5% out of the total 100%. So if I hit F9, very rarely are we going to get a value up here. Most of the value is going to be right by the mean. Now another thing to consider, when you're creating a randomizing formula in Excel, is the relationship between mean and standard deviation. If this were 10,000, that means two standard deviations would be below zero, which is perfectly all right for this distribution but not necessarily for demand. So you have to think about your variable. If our demand can never go below zero, then we're going to have to amend the formula. And here's the entire formula. In essence, we're going to have to take whatever this is, or zero, whichever one is the maximum amount. So the full formula would be, hey, please give me the Max. And I'm simply going to put for our number, 0. That's one of the options for the Max function to look at. Comma. Or round NORM.INVERSE. The probability is RAND argumentless function. Comma. The mean, the standard deviation. There it is. We're going to close off. And notice, I'm reading my screen tips, when I'm nesting functions. It's kind of hard to see where you are, except for our screen tip will really help us out. So when I close parentheses I know I'm back in round. I have to get our comma to get to number of digits zero. I see the screen tip. Close parentheses on the round. That whole number 2, there, is our randomizing function. Only when this is below zero will the Max function pick out the zero. So most of the time with this distribution it's clearly going to get mostly positive values. And if we ran 10,000 records-- which we'll do in our next video for a simulation-- we might get some zeros there. Because three standard deviations out happens once in awhile. You ready? So that's our full formula. Now again, these will get two different values, because it has two different RAND functions. I'm going to Control Shift tilde to apply general formatting. And there we go. There is our full formula for demand randomly creating values. So in this video we saw how to use the normal probability distribution and this formula to randomly generate demand numbers. Back on continuous uniform we saw two different formulas, F9, F9, to create random variables for a uniform distribution for a material cost per unit. Background discrete binomial. If I hit F9, F9, we're randomly creating a number of successes for a binomial distribution. And finally, all the way back on discrete probability distribution, these formulas here, LOOKUP and RAND together, randomly creating values from a discrete probability distribution. Next video we'll see a comprehensive new product profit analysis with a bunch of different uncertain random variables all in one simulation. All right. We'll see you next video.
Info
Channel: ExcelIsFun
Views: 14,253
Rating: undefined out of 5
Keywords: Excel, Microsoft Excel, Highline College, Mike Girvin, excelisfun, Michael Girvin, Mike excelisfun Girvin, Excel Magic Tricks, Business Analytics, BI 348, Data Analysis, Create Random Variables in Excel, Relative Frequency Distribution Variable, Binomial Variable, Uniform Distribution Variable, Normal Bell Distribution Variable, RAND function, RANDBETWEEN Function, NORM.INV Function, BINOM.INV Function, LOOKUP Function, Normal Bell Distribution, Excel Monte Carlo Simulation
Id: 7NNtUs5xAkw
Channel Id: undefined
Length: 22min 41sec (1361 seconds)
Published: Thu Dec 24 2015
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.