Normal Approximation to the Binomial Distribution

Captions
Recall that in the last chapter we talked about the binomial distribution. In this segment we want to describe certain conditions under which the binomial distribution, and problems involving a binomial distribution, can be accurately approximated by using the normal distribution. Now that we're familiar with using the normal distribution, it may turn out that those calculations are a little bit easier than having to go through the calculations for the binomial distribution.

Consider this. Remember, with the binomial distribution, n was the number of trials, r the number of successes, p the probability of success, and q the probability of failure. In our situation here, let us suppose that n, the number of trials, is 25, the probability of success is 0.25, and we want to find the probability that r is greater than or equal to 8. From our previous experience, there are a couple of ways of doing that. One way is to calculate the probability that r is 8, add the probability that r is 9, add the probability that r is 10, and so on, all the way out to adding the probability that r is 25. Another way is to calculate 1 minus the probability that r is less than 8. This would probably be the way that we would decide to do this kind of problem right now.

Now it turns out that a problem like this can be approximated using the normal distribution, and these are the circumstances under which the approximation becomes accurate: when np, the number of trials times the probability of success, is greater than 5, and nq is also greater than 5, we'll use the normal distribution. For this problem, np is 25 times 0.25, or 6.25, and nq is 25 times 0.75, or 18.75, so we meet the criteria for using the normal distribution to approximate the binomial distribution.

Here's how we start the process. We want to calculate the mean and standard deviation; those are very important features involved in using the normal distribution. So calculating the mean and
standard deviation using information from the binomial distribution means that the mean is np, or 25 times 0.25, which is 6.25. The standard deviation is the square root of npq, which is the square root of 25 times 0.25 times 0.75, or approximately 2.17.

Now it may seem at this point that we're just outlining our problem: this is the mean and standard deviation we're going to use in the normal distribution. That's fine. For the problem here we were talking about the probability that r, the number of successes, is greater than or equal to 8, and it would seem that what we're going to calculate using the normal distribution is an x that's greater than or equal to 8. But not quite.

Let's look at the areas involved with the graph of a normal distribution and the graph of the binomial distribution. Remember, the graph involved with a binomial distribution was a histogram, and here's a histogram corresponding with the situation we have at hand. This is just part of the graph; the graph extends a little bit in this direction and quite a bit in that direction. These are probabilities associated with each of these values in the distribution; 6, for example, has a probability of roughly 0.18. Now, areas correspond with probabilities, and we're thinking about adding all of these areas on out to the right.

When we mentally overlay a normal distribution on top of this binomial distribution, it looks about like this. I'm just going to try to hit the midpoints of the tops of these bars to emulate the situation, and it would look roughly like that. But notice, if I'm thinking about the area under this normal distribution, do I want the area starting here at 8 and going out to the right? No, I
don't want just that area. I also want to include the remaining portion in this region. You see, I don't want just half of that histogram bar; I want the whole bar if I'm going out here to the right of 8. So I want to take the shoulder of this bar, the boundary between the value before and the one that I want to include, and so I want to use 7.5. That's the only trick; that's the hitch in this whole get-along. Sometimes we want to include a value on one side or another, and sometimes we want to exclude a value; it depends on whether the inequality is greater than, less than, and so forth. Intuitively you'll know what that is: if we're talking about greater than or equal to 8, then we want everything that is involved with 8 in the binomial distribution, and that stretches all the way back to 7.5. That's why we're using the 7.5 figure in the calculation for the normal distribution.

Let's review just a little bit: binomial distribution, normal distribution. Under the binomial distribution, we had 25 trials and a probability of success of 0.25, and we're looking for the probability that r, our number of successes, is greater than or equal to 8. We go about the business of calculating the mean and standard deviation. Good, we've already done those things. Then, when we jump over to the normal distribution, we use the mean and standard deviation that we calculated here, but we're finding the probability that our x value in the normal distribution is greater than or equal to 7.5, because we're thinking of these histogram bars for the binomial distribution, and we want to capture all of the bar involving 8, which stretches all the way back to 7.5. So we want x to be greater than or equal to 7.5, and away we go with what we know about the normal distribution, that is, z is x minus μ over σ. So we have 7.5
minus 6.25, over 2.17, which is 0.58. So a z of 0.58 is what we're talking about. We would sketch the situation here: we want the area under the curve that's to the right of 0.58. How do we do that? We take all of the area to the right of the mean, which is 0.5, and subtract the area from 0 up to 0.58, which involves using the table. The area from 0 to 0.58 turns out to be 0.2190 from the table. Subtracting, we get 0.281, and the probability that r is greater than or equal to 8 from that binomial distribution is approximately 0.281.

Let's take a quick look at the calculation using the table in the back of the text, which is a left-tail style table. The general idea is to take the total area under the curve and subtract the area to the left of 0.58, because that's the value in the back of the text; then we'll be left with the area to the right of 0.58. The area under the curve is 1, the area to the left of 0.58 from the table is 0.7190, and the difference is 0.281, as expected. Now, the exact value turns out to be 0.2735, so we're within about one percentage point of the actual value by using an approximation with the normal distribution. Pretty neat trick, isn't it?

USA Today reported that 11 percent of all books sold are romance oriented. If a local bookstore sells 316 books on a given day, what is the probability that less than 40 are romances?

Before we even begin to answer the question, let's outline some information. We know that we're talking about 316 books being sold on a given day; that's the n value for our problem. Incidentally, we're in a section involving the binomial distribution and the idea that the normal distribution can emulate the binomial distribution under certain circumstances, so we know we have a binomial probability distribution here.
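
As a numerical check on the first worked example (n = 25, p = 0.25, probability that r is at least 8), here is a minimal Python sketch. It is not part of the video; the function names `normal_cdf`, `exact_at_least`, and `approx_at_least` are illustrative assumptions, and the standard normal area is computed from `math.erf` instead of being read from a printed table.

```python
import math


def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def exact_at_least(n, p, r_min):
    """Exact binomial probability P(r >= r_min), summed term by term."""
    q = 1.0 - p
    return sum(math.comb(n, r) * p**r * q**(n - r) for r in range(r_min, n + 1))


def approx_at_least(n, p, r_min):
    """Normal approximation to P(r >= r_min), shifting the boundary by 0.5."""
    q = 1.0 - p
    if not (n * p > 5 and n * q > 5):
        raise ValueError("np and nq should both exceed 5 for this approximation")
    mu = n * p                      # mean of the binomial distribution
    sigma = math.sqrt(n * p * q)    # standard deviation of the binomial distribution
    z = (r_min - 0.5 - mu) / sigma  # for r_min = 8 this uses x = 7.5, not 8
    return 1.0 - normal_cdf(z)


if __name__ == "__main__":
    print("approximate:", round(approx_at_least(25, 0.25, 8), 4))
    print("exact:      ", round(exact_at_least(25, 0.25, 8), 4))
```

Because this sketch does not round z to 0.58 or the standard deviation to 2.17, its approximate result may differ slightly in the third decimal place from the table-based 0.281.
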
We're told that the probability that a randomly selected book sold at this bookstore is a romance is 11 percent, so the value of p is 11 percent, or 0.11. The q value is easily calculated as 1 minus p: 1 minus 0.11 is 0.89, so q is 0.89. From here we can calculate a mean and standard deviation. Remember, the mean is np, so it's 316 times 0.11, or 34.76. The standard deviation is the square root of npq, which turns out to be this rather lengthy-looking decimal, and we're going to use this long-winded version of the standard deviation for accuracy purposes. If we used a rounded version, we'd be using a value with more error built into it, and we'd be likely to get a rounding error in our answer. So we want as much accuracy as we can in making the various calculations in our problems.

Okay, let's find the probability that less than 40 of these books sold are romances. We begin with the binomial distribution associated with this problem. Notice that at the bottom are numbers of romances, and the bars above those numbers correspond with the probability of that number of romances being sold. We understand that under certain circumstances, that is, when np is greater than 5 and nq is greater than 5, a normal distribution will emulate this binomial distribution, so we can think about the normal distribution overlaying the binomial distribution.

Now, here is the probability associated with exactly 40 romance novels being sold out of the 316 total books sold. We're talking, though, about the probability that less than 40 are romances, so we want to find the area to the left of 40. That is, we want this area under the curve, and we don't really want 40, you see, so we'll have to take that out. I'm illustrating it this way to
illustrate a point: we don't want the probability associated with 40, because we want the probability that less than 40 are romances. So what value are we going to use for x when we use our normal distribution to approximate this binomial distribution? We're going to use the border between 40 and 39, and that border is 39.5. So 39.5 is going to be a kind of magic number for us in just a moment.

Here's how the calculation is made. The probability that the number of romances is less than 40 is approximately (see, here's where we're jumping into our normal distribution) the probability that an x in the normal distribution is less than 39.5. We'll calculate a z associated with this 39.5 like this, remembering that the mean and standard deviation are given above. So we pop in an x of 39.5 and make the calculation to find the z value to be 0.85. With this 0.85, our probability is the probability that z is less than 0.85. So we turn to the table and find that the area to the left of a z of 0.85 is 0.8023, and this is the probability that less than 40 of the books sold on this day are romances.

Let's calculate the probability that at least 25 are romances. We'll start the way we did before, with our binomial distribution and normal distribution overlaid with one another. Here is the position where exactly 25 romances are sold, and we want the probability of at least 25, that is, 25 or more. So we want the area under the curve corresponding with all of this area. Notice that since we want to include 25 as part of the region we're calculating, we're going to use the left-hand border of 25, and that's the border between 24 and 25, so our x is going to be 24.5. So again, we think of it like this: the
probability that the number of romances is greater than or equal to 25 is approximately the same as the probability that an x in the normal distribution is greater than or equal to 24.5. Here we're calculating the z associated with 24.5, and that z is -1.84. So our probability that x is greater than or equal to 24.5 is the probability that z is greater than or equal to -1.84. But we know that our table contains information about the area to the left of z-scores, so we're going to think of this as 1 minus the probability that z is less than -1.84. The area corresponding with a z of -1.84 is 0.0329, and 1 minus 0.0329 is 0.9671, and that is the probability associated with at least 25 of those novels being romances.

Now let's calculate the probability that between 25 and 40 are romances, including 25 and 40. Here is the probability associated with exactly 25, here's the probability associated with exactly 40, and we want to add all the probabilities between those two. Since we're including the 25 and the 40, we're going to use the left border of 25 and the right border of 40 as our magic numbers when we're talking about the normal distribution. So the probability that the number of romances is between 25 and 40, including 25 and 40, is approximately the same as the probability that x is between 24.5 and 40.5, including those two. Now we need z-scores associated with those. For 24.5 the calculation is made like this, so the z is -1.84; we calculated this earlier. Then for 40.5, we replace that x with 40.5 and find the z associated with it to be 1.03, and we put that value into the inequality. Now we may recall that in finding the probability, or the
area between two z values, we simply take the difference of the two areas. That is, we take the area associated with the larger z-score and subtract the area associated with the smaller one. That's all there is to it. A z of 1.03 implies an area of 0.8485, and a z of -1.84, from the table, implies an area of 0.0329. Taking the difference, we find the probability, or the area, to be 0.8156.

The last part of the problem says: in the solution of this problem, what were n, p, and q? Well, we calculated those earlier. Does it appear that both np and nq are larger than 5? Why is this an important consideration? Here were the n, p, and q that we calculated earlier, and if we punch our calculator and calculate np and nq, we get these values. Those values were important, recall, because they allowed us to transition from the binomial distribution to the normal distribution. That is, these conditions assure us that the normal approximation will be sufficiently close to the binomial probability for most purposes.
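
The three bookstore calculations (less than 40, at least 25, and between 25 and 40 inclusive) all follow the same pattern: move each included boundary outward by 0.5, convert to z, and take areas to the left. The following Python sketch shows that pattern under stated assumptions. The helper names `normal_cdf` and `approx_prob` are hypothetical, not from the video, and the normal areas come from `math.erf` with an unrounded z, so results may differ from the table-based answers in the third or fourth decimal place.

```python
import math


def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def approx_prob(n, p, lo=None, hi=None):
    """Normal approximation to P(lo <= r <= hi) for whole-number bounds.

    lo=None means no lower bound; hi=None means no upper bound.
    Included boundaries are shifted by 0.5: lo becomes lo - 0.5
    and hi becomes hi + 0.5.
    """
    q = 1.0 - p
    if not (n * p > 5 and n * q > 5):
        raise ValueError("np and nq should both exceed 5 for this approximation")
    mu = n * p                    # 34.76 for n = 316, p = 0.11
    sigma = math.sqrt(n * p * q)  # kept unrounded, as the transcript recommends
    lower = 0.0 if lo is None else normal_cdf((lo - 0.5 - mu) / sigma)
    upper = 1.0 if hi is None else normal_cdf((hi + 0.5 - mu) / sigma)
    return upper - lower


if __name__ == "__main__":
    n, p = 316, 0.11
    # "Less than 40" means r <= 39, so the boundary is 39.5.
    print("P(r < 40)        ~", round(approx_prob(n, p, hi=39), 4))
    # "At least 25" means r >= 25, so the boundary is 24.5.
    print("P(r >= 25)       ~", round(approx_prob(n, p, lo=25), 4))
    # "Between 25 and 40 inclusive" uses 24.5 and 40.5.
    print("P(25 <= r <= 40) ~", round(approx_prob(n, p, lo=25, hi=40), 4))
```

Writing "less than 40" as `hi=39` makes the include-or-exclude decision explicit before the 0.5 shift is applied, which is the step the transcript describes as choosing the border of the histogram bar.
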
Info
Channel: Dana Mosely
Views: 3,605
Keywords: Basic Math, Prealgebra, Algebra 1, Geometry, Algebra 2, Trigonometry, College Algebra, Math, Mathematics
Id: L91TuuA5QME
Length: 18min 47sec (1127 seconds)
Published: Wed Apr 17 2019