The Normal Approximation of the Binomial Distribution

Captions
So in this video we're going to do the normal approximation to the binomial distribution. There are some adjustments that have to be made, because a normal curve is continuous while a binomial distribution is discrete: in each case you've got a yes or no answer.
Here are the characteristics of a binomial distribution, just to remind you of all of them. There must be a fixed number of trials, so, you know, we're going to flip the coin 100 times. The outcome of each trial must be independent. So it's not like drawing cards out of a deck without replacement. Finding the probability of, say, two queens would be a dependent event, because what you drew the first time affects what you can draw the second time; they depend on each other. But flipping a coin, every time you get a 50-50 chance, so your outcomes are independent. Each experiment can have only two outcomes, or be reduced to two outcomes: yes or no, true or false, rain or no rain. And the probability of a success must remain the same for each trial. That's what I was getting at with rain: if you had a 30 percent chance of rain for five days in a row, you've got five trials, they're all the same, you've got a 30 percent chance of rain and a 70 percent chance of no rain, and you can calculate the probability of that.
In addition to those, there's a procedure for the normal approximation. The first thing is we've got to check whether the normal approximation can be used. A general rule of thumb is that n times p, where p stands for the probability of the event happening, has to be greater than or equal to 5, and n times q has to be greater than or equal to 5.
Anything less than that and you can't use it. Basically these are the means of your probability distribution: n times p would be the mean of it being true, that it's going to happen, and n times q would be the mean of it being false. They've both got to be at least five in order to use the normal approximation.
Okay, here are the rest of the steps, and we'll come back to these. You find the mean and standard deviation, which hopefully you've already done in other assignments. Write the problem in probability notation using X. Then rewrite the problem using the continuity correction factor.
The continuity correction factor is the next thing we need to look at, and it's this table here. I call this a fudge factor, but that's probably incorrect. If you want the probability that X is equal to some number a, you subtract 0.5 from it and add 0.5 to it, and find the probability between those. Then there's greater than or equal to a (I just couldn't find the symbol in my program, so that would be that one), and this one would be the probability that X is less than or equal to a. So there's your correction for continuity.
So let's look at a problem and just dive into this thing. If a baseball player's batting average is 0.32, find the probability that the player will get at most 26 hits in 100 times at bat. We're going to assume that the batter never gets better; his probability is always the same, 0.32.
The first thing you do is test whether we can even use the normal approximation, so we're going to find n times p and n times q. We know that p is 0.32, and q then would be 0.68, if you subtract that from 1. So np would be 0.32 times 100 and nq would be 0.68 times 100, and we end up with 32 and 68. Both are at least 5, so we can indeed use the normal approximation.
The next thing we've got to do is find the mean and the standard deviation. Well, we've already found the mean: mu equals np, which is 32, the mean number of hits. The standard deviation in a binomial distribution is the square root of npq. So we calculate the square root of 100 times 0.32 times 0.68, and you do all that math and you get 4.66. We'll keep those numbers in mind.
The third thing is to write the problem in probability notation using X, so that we can use the continuity correction table. The key is right here: find the probability that the player will get at most 26 hits. At most 26 hits looks like this: the probability that X is less than or equal to 26. At most means you could have 26, you could have 25, 24, 23, and on down. With the probability of X less than or equal to 26, we can go ahead and use our correction for continuity. We're using this one right here, so we add a correction factor of 0.5 to the value, and my probability is actually going to be X less than or equal to 26.5.
So I rewrote the problem with the correction factor involved. Now we're going to find the corresponding z-score and then use the z-score table to find the solution. We know the formula for the z-score: it's your value x minus the mean mu, divided by your standard deviation. In our case the z-score is 26.5 minus our mean of 32, divided by our standard deviation of 4.66, and we get a value of negative 1.18.
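The steps up to this point can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the batting example, not something shown in the video; the variable names (`can_approximate`, `x_corrected`, and so on) are made up for the sketch.

```python
import math

n = 100    # number of at-bats (fixed number of trials)
p = 0.32   # probability of a hit on each at-bat
q = 1 - p  # probability of no hit

# Step 1: rule-of-thumb check, both np and nq should be at least 5.
can_approximate = n * p >= 5 and n * q >= 5

# Step 2: mean and standard deviation of the binomial distribution.
mu = n * p
sigma = math.sqrt(n * p * q)

# Steps 3-4: "at most 26 hits" is P(X <= 26); the continuity
# correction moves the cutoff up by 0.5.
x_corrected = 26 + 0.5

# Step 5: z-score of the corrected cutoff.
z = (x_corrected - mu) / sigma

print(can_approximate, round(mu, 2), round(sigma, 2), round(z, 2))
```

Rounded to two decimals, this should reproduce the values from the worked example: a mean of 32, a standard deviation of 4.66, and a z-score of negative 1.18.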
Now, I always draw a picture of the situation. Our normal distribution looks something like this; there's the bell curve. Our mean is 32, and we're looking for 26.5, which is here (I didn't establish a scale). We're looking for the probability of my hits falling in this region, anything less than twenty-six and a half hits. The corresponding z-score for 26.5 is negative 1.18, so we'll go to the z-score table. You could also plug it in online if you wanted to, because there are lots of probability calculators out there and they're free, but I like the old-fashioned way with the table because that's how I grew up.
So, negative 1.18. Well, the table doesn't have negative values, but the curve is symmetric, so we look up 1.18: here's 1.1, and then you go over to 8. There it is, 0.3810.
Let me redraw the picture so we have an idea of where we're at. We're here at 26.5, which is a z of negative 1.18; zero, the mean, is here; and we want everything to the left. Now, this is kind of important, so let me go to the bottom of the z-score table so we can see what's happening. (You can see the year 1973 on it; I was born one year after that, just to give you my age.) This table gives us everything between zero, the mean, and the standard score we're looking for. So in our case the value I ended up with, 0.3810, is all of this in blue. We don't want that; we want all of this in red. What you have to realize is that all the data is contained within the entire curve, and half the data is located on each side. Since we're just dealing with the left side of the graph, all we have to do to find that red tail is take 0.5 minus 0.3810, and we end up with 0.119.
What this means, and this is important: you want to summarize what actually happened here. These are real problems with real solutions; just boxing the number in is not good enough. We need to say what happened, so that if you gave this to somebody else they'd be able to read it and say, oh yeah, that makes sense. In our case the answer looks something like this: the probability that the player will have at most 26 hits is 11.9 percent. If you think about it, most of the time they're going to have more than 26 hits in 100 at-bats. So that's pretty cool.
All right, I hope this helps. There are several other situations I didn't cover in this video. You could have just less than a instead of less than or equal to, and in that case you subtract 0.5. You could have exactly equal to a, and in that case you subtract 0.5 and add 0.5 and find what's in between them.
So just to summarize with pictures. For exactly a: here's my mean and here's a, and we'd be looking for everything between these two, a minus 0.5 and a plus 0.5. For greater than or equal to a: here's my mean, here's a, we subtract 0.5, which puts us over here, and we want everything to the right. For strictly greater than a: here's a, we add 0.5, so shift to the right a little bit, and then it's everything to the right. Less than or equal to a we already did. And the last one, less than a: here's mu, here's a, we subtract 0.5, which puts us over here a little ways, and then we want everything to the left of that. So that's the situation in each case.
If I have time I'll make another video, but with a track meet tomorrow it's probably not going to happen. Best of luck, see you next time, and good luck on the assignment.
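For anyone who would rather compute than read a table, here is a minimal Python sketch of the five continuity corrections and the final probability. It is not from the video: the helper names (`corrected_bounds`, `normal_cdf`, `normal_prob`) and the labels "eq", "ge", "gt", "le", "lt" are invented for illustration, and it uses the error function from Python's standard library in place of the printed z-score table. The exact binomial sum is included only as a sanity check on the approximation.

```python
import math

def corrected_bounds(kind, a):
    """Continuity-corrected (low, high) interval for an event about the integer a."""
    table = {
        "eq": (a - 0.5, a + 0.5),      # P(X = a)
        "ge": (a - 0.5, math.inf),     # P(X >= a)
        "gt": (a + 0.5, math.inf),     # P(X > a)
        "le": (-math.inf, a + 0.5),    # P(X <= a)
        "lt": (-math.inf, a - 0.5),    # P(X < a)
    }
    return table[kind]

def normal_cdf(x, mu, sigma):
    """Area under the normal curve to the left of x."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

def normal_prob(low, high, mu, sigma):
    """Area under the normal curve between low and high."""
    return normal_cdf(high, mu, sigma) - normal_cdf(low, mu, sigma)

# Batting example: at most 26 hits in 100 at-bats with p = 0.32.
n, p = 100, 0.32
q = 1 - p
mu = n * p
sigma = math.sqrt(n * p * q)

low, high = corrected_bounds("le", 26)
approx = normal_prob(low, high, mu, sigma)

# Exact binomial probability P(X <= 26), summed term by term.
exact = sum(math.comb(n, k) * p**k * q**(n - k) for k in range(0, 27))

print(f"normal approximation: {approx:.4f}")
print(f"exact binomial:       {exact:.4f}")
```

The approximation should come out close to the 0.119 found with the table; small differences are expected because the table lookup rounds the z-score to two decimals.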
Info
Channel: searching4math
Views: 271,402
Keywords: statistics, binomial, distribution, approximation, normal
Id: k9nRcadQYsU
Length: 14min 57sec (897 seconds)
Published: Wed Mar 27 2013