An Introduction to the Binomial Distribution

Captions
Let's look at an introduction to the binomial distribution, a very important discrete probability distribution. I'm going to assume that you know the combinations formula, also known as the binomial coefficient, because it plays a role in the binomial distribution.

A coin is flipped 100 times. What is the probability heads comes up at least 60 times? We'll see that we can use the binomial distribution to calculate this probability. Here's another example. You buy a certain type of lottery ticket once a week for four weeks. What is the probability you win a cash prize exactly twice? Well, this might sound a little different from the coin-tossing example, but from a probability perspective, it's really the same type of scenario. And we can again use the binomial distribution to calculate this probability.

Here's how the binomial distribution arises. The number of successes in n independent Bernoulli trials has a binomial distribution. I look at the Bernoulli distribution in another video, and other distributions like the binomial and geometric are built on a base of independent Bernoulli trials. But let's break down what independent Bernoulli trials means.

Suppose there are n independent trials. n is a fixed number, and by independent trials we mean that knowing the outcome of one trial gives us no information about the outcome of another trial. Each trial can result in one of two possible mutually exclusive outcomes, and we'll label those two possibilities as success and failure. A success simply means the event that we are counting up, and it may very well be a bad thing, like having a tire blowout. The probability of success on any one trial is p, and this stays constant from trial to trial. Success and failure are mutually exclusive and exhaustive, so failure is simply the complement of success, and the probability of failure is simply 1 minus the probability of success, or 1-p.
The random variable X represents a count of the number of successes in those n trials. Then X has a binomial distribution, with this probability mass function. The probability the random variable X takes on the value little x, which I'll sometimes write as p(x), is the probability we get little x successes in n trials. That is the probability of success, p, raised to the number of successes, x, times (1-p), the probability of failure, raised to the number of failures, which is n-x. We multiply that by the number of ways we can get x successes in n trials, which is n choose x. But I'm going to break that down in a little more detail in a moment. It's not a probability distribution until we list what values X can take on, and here we are counting up the number of successes in n trials, so X can take on only whole number values from 0 to n.

The mean or expectation of a binomial random variable X is equal to np. We can show that mathematically without too much difficulty using the formula for the mean of a discrete random variable that we've discussed earlier, and a little algebra. But I hope this also makes a little bit of sense. If, say, we had a probability of success on an individual trial of 0.2, and we had 100 trials, then on average we'd get 20 successes. The variance of a binomial random variable, sigma squared, is equal to np(1-p). Again, we can show this using the formula for the variance of a discrete random variable and a little algebra. And I work through the derivations of the mean and variance in another video.

Let's look at a simple example. A balanced six-sided die is rolled three times. What is the probability a 5 comes up exactly twice? You might think this doesn't have a binomial distribution, since there are six possibilities on any given roll, any given trial. And that might be a problem if the question were different.
For example, if the question were: what is the probability a 6 comes up once, a 4 comes up once, a 3 comes up once, and no other numbers are rolled? Well, this is not a binomial situation, and we could not answer it with the binomial distribution. We'd have to use a distribution called the multinomial distribution. But here that's not what the question is asking. We're simply asking about the number of fives. So we will define a success as rolling a 5, and a failure as rolling anything but a 5. A failure is simply everything that is not a success. So even though it might not have looked like a binomial to start off, it reduces to two outcomes on any individual trial.

And if we let X represent the number of fives in three rolls, then X has a binomial distribution with n=3 and p=1/6. The probability of success, 1/6, stays constant from trial to trial. And knowing what happens on one roll gives us no information on what will happen on another roll, and so the trials are also independent. So the conditions of the binomial distribution are satisfied here, and to find this probability, we can put this into our binomial formula. The probability the random variable X takes on the value 2 is found as follows: our n is 3 and we need 2 successes, so we have 3 choose 2, times p, which is 1/6, the probability of success, the probability of getting a 5, raised to the number of successes, which is 2. We multiply that by the probability of failure, 1-p, or 1-1/6, raised to the number of failures, which is 3-2. And if we do this calculation we'd see that this works out to 0.0694, when rounded to four decimal places. So that's our probability of getting exactly two 5s if we roll a fair die three times.

Let's look at the binomial formula in greater detail. Why does this formula work? p raised to the x times (1-p) raised to the n-x is the probability of one specific ordering of x successes and n-x failures.
But we don't care about the ordering, we just care about the total number of successes, so we need to multiply that by the number of possible ways of getting x successes and n-x failures, and that is simply n choose x. Let's take a bit of a closer look at that.

Let's look at the die rolling situation where n was 3. On the first roll we can get either a success or a failure, and then on the second roll we can get either a success or a failure, and for each of those four possibilities, on the third roll we can either get a success or a failure, so there are 8 possibilities here. This one on top gives us success, success, success, so I'm going to write SSS, and this one down here, say, gives us failure, failure, success, so I would write FFS. Let's fill in the rest of those.

Now suppose we wanted to find the probability, as we did in the die example, of exactly two successes in three trials, or in other words the probability our random variable X, representing the number of successes, takes on the value 2. Well, if we look at these possibilities, two successes occurs three times. The individual trials are independent, so to find the probability of a single one of the sequences, we simply multiply the probabilities on the individual trials together. Success has a probability of p, and failure has a probability of 1-p, so each one of those sequences individually has a probability of p^2 times (1-p), since success happens twice and failure happens once. To get the overall probability, we need to multiply the probability of a specific ordering of x successes and n-x failures by the number of ways x successes can happen. So here, the probability of getting exactly two successes is going to be equal to 3 times p^2 times (1-p), and I'll just put that to the first power, indicating failure happened once. And if we worked out the binomial coefficient here, 3 choose 2, we'd see that that is equal to 3.
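The counting argument above can be checked by brute force. The following is a minimal sketch, not something shown in the video: it lists every ordering of successes (S) and failures (F) for the die example, keeps the orderings with exactly two successes, and compares that count with 3 choose 2. The variable names are illustrative assumptions, and it uses only the Python standard library.

```python
from itertools import product
from math import comb

# Die example from the transcript: 3 rolls, 2 fives, P(five) = 1/6.
n, x, p = 3, 2, 1 / 6

# All 2**n orderings of success (S) and failure (F).
sequences = ["".join(seq) for seq in product("SF", repeat=n)]

# Keep only the orderings with exactly x successes.
matching = [seq for seq in sequences if seq.count("S") == x]

# Each matching ordering has the same probability: p**x * (1 - p)**(n - x).
prob_one_ordering = p**x * (1 - p) ** (n - x)
prob_total = len(matching) * prob_one_ordering

print(len(sequences))                # 8
print(matching)                      # ['SSF', 'SFS', 'FSS']
print(len(matching) == comb(n, x))   # True
print(round(prob_total, 4))          # 0.0694
```

Enumeration like this is only practical for small n, since the number of orderings doubles with every trial; that is why the formula uses n choose x instead of listing sequences.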
n choose x gives us, in the general setting, the number of different ways of getting x successes in n trials. And so in the general case the probability mass function for the binomial distribution, the probability the random variable X takes on the value little x, is equal to n choose x, times p raised to the number of successes, x, times (1-p) raised to the number of failures, n-x.

Let's look at another example. According to Statistics Canada life tables, the probability a randomly selected ninety-year-old Canadian male survives for at least another year is approximately 0.82. If twenty ninety-year-old Canadian males are randomly selected, what is the probability exactly 18 survive for another year? Well, it might appear to be a bit more complicated a situation than rolling dice, but the premise is essentially the same. Here we will call a success the man surviving for at least a year, because we're counting up the number that survive. And a failure is that they die within the next year. We have a fixed number of men, 20 of them, and since we're randomly selecting the men it's pretty reasonable to consider them independent. Conceivably there could be a wacky situation like a serial killer focussed on killing ninety-year-old Canadian males, but that's probably not something we need to worry too much about. So here the binomial distribution is reasonable.

We'll let X represent the number of men that survive for at least one more year. Then X is going to have a binomial distribution with an n of 20 and a p of 0.82. You might see that written as X is distributed, or has the binomial distribution, with an n of 20 and a p of 0.82. We want the probability that X takes on the value 18. We're going to use our binomial probability mass function to calculate that.
That's simply going to be equal to our n of 20, choose 18, times the probability of success, 0.82, raised to the number of successes, 18, times 1 minus the probability of success, 1-0.82, raised to the number of failures, which is 20-18. And this, rounded to three decimal places, works out to 0.173. So the probability that exactly 18 of these randomly selected men survive for at least a year is 0.173.

If we worked out the probabilities for all the possible values of X, and plotted them, it would look like this. This is the binomial distribution when n is 20 and p is 0.82. And the probability we just worked out, the probability that X takes on the value 18, is over here. Right here is the value 0.173 that we just worked out. If we wanted the mean number of men that would survive, the mean of this probability distribution, that would be mu, which for the binomial distribution is np, and in this particular case that's 20 times 0.82, and that works out to 16.4. So on average, 16.4 of the twenty randomly selected ninety-year-old Canadian males would survive for at least one more year. The variance of a binomial distribution is np(1-p). And here that's 20 times 0.82 times (1-0.82), and that works out to 2.952. And the standard deviation is simply going to be the square root of 2.952.

Suppose we have a different type of question. In the same setting, what is the probability that at least 18 men survive for another year? We need the probability of at least 18, so let's shade those values in. We're going to need to add the probabilities of 18, 19, and 20 together. We're going to have to use the binomial formula three times, on each one of those values individually. So let's go ahead and do that. The probability the random variable X takes on a value that is at least 18 is the probability it equals 18 plus the probability it equals 19 plus the probability it equals 20. n was 20, so X can't possibly take on a value greater than that.
And to find each one of those, we simply use the binomial formula with n=20 and p=0.82, the values given in the problem. So this is the probability of 18, this is the probability of 19, and down here is the probability of 20. And you can verify for yourself that these work out to the values given here. I'm rounding to 3 decimal places here, but you should carry many decimal places throughout your calculations. If we add those up, we get our final answer of 0.275, which again is the correct answer rounded to three decimal places. The probability that at least 18 of the 20 men survive for at least another year is 0.275.

The binomial distribution is an extremely important discrete probability distribution, one that has many applications in probability calculations and statistical inference, so we'll deal with the binomial distribution very frequently in probability and statistics.
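The worked examples in the transcript can be reproduced with a few lines of code. This is a minimal sketch using only the Python standard library, not anything shown in the video. The helper names `binom_pmf` and `binom_at_least` are hypothetical, chosen here for illustration. Probabilities are kept at full precision and rounded only when printed, in line with the advice to carry many decimal places through the calculation.

```python
from math import comb, sqrt


def binom_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p): (n choose x) * p^x * (1-p)^(n-x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binom_at_least(k, n, p):
    """P(X >= k): sum the pmf over x = k, k+1, ..., n."""
    return sum(binom_pmf(x, n, p) for x in range(k, n + 1))


# Die example: exactly two 5s in three rolls of a fair die.
print(round(binom_pmf(2, 3, 1 / 6), 4))      # 0.0694

# Survival example: n = 20 men, p = 0.82.
n, p = 20, 0.82
print(round(binom_pmf(18, n, p), 3))         # 0.173
print(round(binom_at_least(18, n, p), 3))    # 0.275

mean = n * p                  # np
variance = n * p * (1 - p)    # np(1-p)
std_dev = sqrt(variance)
print(round(mean, 1))         # 16.4
print(round(variance, 3))     # 2.952
print(std_dev)
```

The same `binom_at_least` helper would apply to the opening coin question (at least 60 heads in 100 flips) by calling it with k=60, n=100, p=0.5.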
Info
Channel: jbstatistics
Views: 562,524
Keywords: binomial, binomial distribution, probability, discrete probability distributions, binomial example, probability examples, random variables, discrete random variables, jbstatistics, jb statistics, statistics, introductory statistics, 8msl, 8 minute stats lectures, intro stats videos, intro stats help, stats help, stats tutor
Id: qIzC1-9PwQo
Length: 14min 10sec (850 seconds)
Published: Sat Oct 26 2013