An Introduction to the Geometric Distribution

Captions
Let's look at an introduction to the geometric distribution, a discrete probability distribution. Let's open with an example that we'll come back to later. In a large population of adults, 30 percent have received CPR training. If adults from this population are randomly selected, what is the probability that the sixth person sampled is the first that has received CPR training? We can use the geometric distribution to calculate this probability.

The geometric distribution is the distribution of the number of trials needed to get the first success in repeated independent Bernoulli trials. The distribution of the number of trials needed to get the first success can be important to us at times, for example, if we're trying to find someone who can perform CPR.

Let's look in greater detail at what is meant by repeated independent Bernoulli trials. Suppose there are independent trials, and each trial results in one of two possible mutually exclusive outcomes, and we're going to label those outcomes as success and failure. On any individual trial, the probability of success is p, and this stays constant from trial to trial. Failure is the complement of success, so the probability of failure on any individual trial is just 1 minus p.

We're going to let the random variable X represent the number of trials needed to get the first success, in other words, the trial number on which the first success occurs. The geometric distribution can be defined in different ways, and some sources define the random variable to be the number of failures before getting the first success, which is a little different, but we'll use this definition.

Now let's work out the probability mass function for the geometric distribution. In order for the first success to occur on the xth trial, we need two things to occur: first, we need the first x minus 1 trials to be failures, and then we need the xth trial to be a success. The probability that the xth trial is a success is p, the probability of success on any individual trial, and since the
trials are independent, the probability that the first x minus 1 trials are all failures is the probability of failure on any individual trial, 1 minus p, raised to the x minus 1. And since the trials are independent, the probability that the random variable X takes on the value little x is the product of these two probabilities: (1 minus p) raised to the x minus 1, times p. This is the probability mass function for the geometric distribution.

What values can the random variable X take on? We could get a success on the first trial, so X could equal 1. We could get a failure and then a success, so X could equal 2, and so on. There is no upper bound on the values X can take on. The geometric distribution has a single parameter, the probability of success p. The mean and variance are functions of p: the mean of the geometric distribution is 1 over p, and the variance is (1 minus p) over p squared.

Now let's look at how we might use the geometric distribution to answer a probability problem. Let's return to the example. In a large population of adults, 30% have received CPR training. If adults from this population are randomly selected, what is the probability that the sixth person sampled is the first that has received CPR training? Here we're interested in the number of people required to get the first with CPR training, in other words, the number of trials needed to get the first success. Since we are randomly sampling from a large population, it's reasonable to think that the trials are independent: knowing whether one person has or has not received CPR training doesn't tell us anything about whether the next randomly selected person has it.

So this is a geometric distribution problem. We have repeated independent trials, we are interested in the number of trials required to get the first success (the first person with CPR training), and the probability of success is constant: on any individual trial, p is equal to 0.3, or 30%. Here's the probability mass function. We want the probability that the random
variable X takes on the value 6, the probability that the first person with CPR training is the sixth person sampled. This equals (1 minus 0.3) raised to the fifth power, times 0.3: we need the first 5 people to not have CPR training, and then the sixth to have it. This works out to 0.0504 when rounded to 4 decimal places.

We could work out the probabilities for different values of x and plot them. Let's do that and see what that looks like. Here's a plot of the probabilities for various values of x. We just worked out this probability here, the probability that X takes on the value 6. Let's note a few things about this distribution. First, the minimum value X can take on is 1, but there is no upper bound. I truncated it here at 15 for plotting purposes, but there is no maximum value. The mode of the geometric distribution, the most likely value, is always 1, and then these probabilities decrease, and so the geometric distribution always has this type of shape, showing right skewness.

The mean of the geometric distribution is 1 over p; in this case that's 1 over 0.3, which is 3.3 repeating. That's right about here on this distribution. The variance of the geometric distribution is (1 minus p) over p squared; here that's (1 minus 0.3) over 0.3 squared, which works out to 7.7 repeating. The standard deviation is the square root of that, the square root of 7.7 repeating, which is 2.79 when rounded to two decimal places.

Now suppose we want a different probability: the probability that the first person trained in CPR occurs on or before the third person sampled. We need to find the probability that the random variable X takes on a value that is less than or equal to 3. Let's see what that looks like on a plot. Here's the distribution, with the parts shaded in red representing the probabilities we need to find. We need to find these three probabilities and add them together. Recall that p equals 0.3: the probability of success on any individual trial is 0.3, and the probability of
failure is 0.7. The probability the first success occurs on the first trial is 0.7 raised to the 0, times 0.3: we get a success on the very first trial. The probability the first success occurs on the second trial, well, we are going to need the first trial to be a failure, so 0.7 raised to the first power, times 0.3. And for the first success to occur on the third trial, we need two failures: 0.7 squared, times the probability of success, 0.3. And this all works out to 0.657.

That's a correct way of going about answering this question, but it would become cumbersome if we had to calculate a large number of probabilities. Fortunately, for the geometric distribution there is an easier way. Let's develop the cumulative distribution function for the geometric distribution using some logic and basic probability rules. The probability that X takes on a value greater than 3 is simply the probability of getting 3 failures in a row, which is 0.7 cubed. This isn't based on the geometric probability mass function; it's based on basic probability rules and a little logic. If the first three trials are failures, then X must take on a value greater than 3. The event that X takes on a value less than or equal to 3 is just the complement of this, so the probability that X takes on a value less than or equal to 3 is simply 1 minus 0.7 cubed, which works out to 0.657. So we could have saved ourselves some time by doing the calculations that way.

We just worked out, using some basic logic and probability rules, the cumulative distribution function for the geometric distribution. Let's write it out a little more formally. The cumulative distribution function, often represented by capital F of x, is the probability that the random variable X takes on a value that is less than or equal to little x. For the geometric distribution, this works out to 1 minus (1 minus p) raised to the x, and this is true for whole number values of x that are at least 1. We came up with the cumulative distribution function on the previous slide
using a little logic. We could show it more formally using a mathematical argument based on a geometric series, but I'm not going to do that here. And that's a brief introduction to the geometric distribution.
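
The calculations from the CPR example can be checked with a short Python sketch. This is only an illustration, not something shown in the video: the function names geometric_pmf and geometric_cdf are made up here, and the code assumes x is a whole number and uses the "number of trials until the first success" definition from the captions.

```python
import math


def geometric_pmf(x, p):
    """P(X = x): the first success occurs on trial x (x = 1, 2, 3, ...)."""
    if x < 1:
        return 0.0
    # x - 1 failures in a row, then one success
    return (1 - p) ** (x - 1) * p


def geometric_cdf(x, p):
    """P(X <= x) = 1 - (1 - p)**x for whole-number x >= 1."""
    if x < 1:
        return 0.0
    # Complement of "the first x trials are all failures"
    return 1 - (1 - p) ** x


p = 0.3  # probability a randomly selected adult has CPR training

# Probability the sixth person sampled is the first with CPR training
prob_sixth = geometric_pmf(6, p)

# P(X <= 3) two ways: summing three PMF values, and using the CDF
prob_by_third_sum = sum(geometric_pmf(k, p) for k in range(1, 4))
prob_by_third_cdf = geometric_cdf(3, p)

# Mean, variance, and standard deviation as functions of p
mean = 1 / p
variance = (1 - p) / p ** 2
std_dev = math.sqrt(variance)

print(round(prob_sixth, 4))         # 0.0504
print(round(prob_by_third_sum, 3))  # 0.657
print(round(prob_by_third_cdf, 3))  # 0.657
print(round(mean, 2), round(variance, 2), round(std_dev, 2))  # 3.33 7.78 2.79
```

Summing the three probability mass function values and using the cumulative distribution function give the same answer, which is the shortcut described in the captions.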
Info
Channel: jbstatistics
Views: 235,052
Keywords: Geometric distribution, geometric, Discrete random variables, discrete, random, variables, discrete probability distributions, introductory statistics, introductory, statistics, jbstatistics, jb statistics, intro stats videos, intro stats help, stats help, stats tutor, jeremy balka
Id: zq9Oz82iHf0
Length: 10min 47sec (647 seconds)
Published: Thu Jan 30 2014