Statistics 101: The Binomial Distribution

Captions
Hello, and welcome to the next video in my series on basic statistics and, for this video, finite mathematics. The topic of this video is taught in both stats courses and finite math courses, so you may be watching this in a stats playlist or in a finite math playlist, but either way it applies to both.

Now, a few things before we get started. Number one: if you're watching this video because you are currently having problems in a class, I want you to stay positive and keep your head up. If you're watching this, it means you have come pretty far in your education up to this point, and you may have just hit a rough patch. I know that with the right amount of hard work, patience, and practice, you can get through it. I have faith in you, and so should you. Number two: please feel free to follow me here on YouTube and/or on Twitter; that way, when I upload a new video, you know about it. And on the topic of the video, if you liked it, please give it a thumbs up. That does encourage me to keep making them; if I know individuals find them beneficial, I want to keep doing them. On the flip side, if you think there is something I can do better, please leave a constructive comment below the video and I will try to incorporate that in future videos. And finally, just keep in mind that these videos are geared towards individuals who are relatively new to stats, or finite math for that matter, so I will be going over the basic concepts very slowly and very deliberately. I want you to know what's going on, but I also want you to know why it's going on. So, all that being said, let's go ahead and get started.

This video is the next in a series on probability distributions. In previous videos I talked about discrete probability distributions in great detail; we looked at coin flips and die rolls, among other things. The binomial distribution is a specific type of discrete probability distribution. And remember, a discrete distribution just means that the numbers, or the outcomes, can only have a
finite number of values, and they are almost entirely whole numbers like 1, 2, or 6, or whatever; they're not going to be like 47.895. So a discrete probability distribution means that each outcome is usually a whole number, and it also means there are a finite number of outcomes, or an infinite sequence of numbers like 1, 2, 3, 4, 5, and so on. But again, that's in a previous video and I don't want to go into all that right now, so if you'd like some sort of refresher on discrete probability distributions, go back and look at those videos.

Now, to understand the binomial distribution we're going to be using an example, and this example is about the performance of salespeople we manage. We're going to get at the binomial distribution by looking at how two salespeople on our floor do in terms of closing sales, their productivity, and how many sales calls they make per day. So let's go ahead and take a look at that problem.

Here's our example. You are a sales manager, and of course as the sales manager you analyze the sales records for all salespersons under your guidance. You look at how many sales calls each person makes, how many successful sales they make (how many closes they make relative to their calls), and their productivity: how many calls they make per day and things of that nature. Right now you're going to be looking at two individuals on your sales floor. Joan has a success rate of 75%, so if she calls someone during the day and makes her sales pitch, she closes that sale 75% of the time. That's really good. She averages 10 sales calls per day, because these are not just quick one-minute sales calls; these are more involved sales calls that your employees make. But she makes 10 per day and she closes about 75% of them. Now, Margo has a success rate of only 45%, so much lower, but she averages 16 calls per day. So: different success rates and different average call counts per day. So here's the question:
what is the probability that each salesperson makes exactly 6 sales on any given day? So what is the probability that Joan closes 6 sales on any given day, given her success rate and average number of calls? Same thing for Margo: what's the probability that she makes six successful calls in a day, given a rate of 45 percent and 16 calls per day? That's the question we'll answer at the end of this video using the concept of the binomial distribution.

Now, we can begin understanding the binomial distribution by looking at the binomial experiment, which is a specific kind of experiment that has very specific conditions. Let's go ahead and look at this. A binomial experiment has the following characteristics.

Number one: the process consists of a sequence of n trials. We're going to call these iterations. So if we have a sequence of 6 trials, we have the first iteration, the second, the third, fourth, fifth, and sixth. And that number of trials, whatever our n is, is fixed.

Number two: the process has only two mutually exclusive outcomes. One is called a success and the other is called a failure. Much like a coin flip can only be heads or tails, this binomial experiment can have a success or a failure, and those are mutually exclusive. Now, there is an asterisk to this condition, which we'll talk about at the end.

Number three: the probability of success is denoted by a lowercase p, and that probability does not change from trial to trial. The probability of failure is 1 minus p, because remember, the probabilities have to add up to 1. So if success is p, then failure is 1 minus p, and it's also fixed from trial to trial. This fixed nature of the probabilities is very important. Think of a coin flip: the probability of heads is 0.5 and the probability of tails is 0.5. Do those probabilities change from flip to flip? Well, no, they remain fixed, and so is the case in the binomial experiment. Of course they
won't always be 0.5 and 0.5; they might be 0.25 and 0.75, or 0.1 and 0.9, but they do remain fixed.

And finally, the trials are independent. The outcome of a previous trial does not influence the next trial or future trials. Again, think about a coin flip. If I flip a coin and tails comes up, does the fact that tails came up that time influence what comes up next time? Well, no, it doesn't, because each coin flip is independent. Same thing for this binomial experiment. So this is very similar in a way to coin flips, because on a coin the probabilities are fixed, there are only two outcomes, and the flips are independent. And of course in the binomial process we have a sequence of flips, or sequence of trials.

Now, here's the asterisk. If you look at number two up there, one outcome is success and one is failure. But success and failure can be counterintuitive if you take them literally. A success doesn't always mean a positive outcome. A success is really whatever characteristic we're looking for, and the failure is the opposite of that. So don't think of success and failure as good and bad; it doesn't always work that way.

Let's look at an experiment involving five trials, so our n equals 5. Our interest is in the number of successes in n trials, so what we do is establish a discrete random variable x to represent the number of successes in our trials. Let's think about this. We have an experiment that has five trials, and we're interested in the number of successes in those five trials. We could have zero successes, one success, two successes, three successes, and so on. And of course the opposite is the failure: if we have zero successes, that means all five had to be failures; if we had three successes, that means the other two had to be failures. So you can kind of see how this works. The number of successes can take any value 0, 1, 2, 3, 4, or 5: we could have 0 successes all the way up to all five being successes in these 5 trials.

Let's look at one version of the experiment. So we
have different iterations: one, two, three, four, five. At the first iteration we have a success, then a failure, then a failure again, then a success, then a failure. This is a lot like a coin flip coming up tails, heads, heads, tails, heads, or whatever it might be. So: success, failure, failure, success, failure. We have two successes and three failures in that experiment.

Let's look at another one. In this one we have success, success, failure, success, success, and those are our five trials. So in this case we have four successes and one failure. You can kind of see how this system of trials works: five trials because our n is 5, each outcome can be either success or failure, and they are independent.

So let's look at the case of four successes and one failure. Now think about this: that failure can occur at any of the five iterations. In this chart, in the first row the failure is in the third iteration; in the next row the failure is in the first iteration; in the next one it's in the fourth; in the second row from the bottom it's in the second; and in the last row it's in the last. So the failure can occur in any of the five spots in the trial sequence, which means the successes occur in the other four. Remember, we're interested in the successes, so how can we represent this? We can look at this chart and say, well, there are five different ways we can have four successes and one failure. How do we write that in mathematical notation? 5 choose 4. If you remember our combinations video, this is where it comes into play. What we're saying here is there are five trials and we are interested in the four successes, but those four successes can be in any of five different combinations. In the top row, the successes are in one, two, four, and five; in the second row, the successes are in two, three, four, and five; in the third, they're in one, two, three, and five; etc. So there are five different ways to have four successes in these five
trials. So 5 choose 4 equals 5, and we can actually see them there in that chart.

Let's go ahead and fill out our chart. This is for every combination of success and failure. In the first column we have n, which is our number of trials; that's always 5. In the second column we have the number of successes, so 0, 1, 2, 3, 4, and 5. The number of failures is the difference: if we have 0 successes we have 5 failures, if we have 1 success we have 4 failures, if we have 2 successes we have 3 failures, and so on. Now, in the combination column, remember we're interested in successes. In the top row we have 5 choose 0. How many ways can we do that? Well, 1, because 5 choose 0 means 0 successes, so all five are failures, and there's only one way to do that: failure, failure, failure, failure, failure. Now, with 1 success and 4 failures, that's 5 choose 1, because we're interested in the successes, and that is 5. Let's skip down to 3 successes and 2 failures. That is 5 choose 3, because those three successes can be arranged in the chart in 10 different ways. So this is just simple combinations, where the first number is the number of trials and the second number is the number of successes: 5 choose 0, 5 choose 1, 5 choose 2, etc. The totals are 1, 5, 10, 10, 5, and 1.

So let's take this further and look at another problem, and this will get us toward our final problem. A very poor, very bad manufacturer is making a product with a 20% defect rate, so 20% of the product being produced is defective. That's bad. Now, if we select 5 randomly chosen items at the end of the assembly line, what is the probability of having exactly one defective item in our sample? When I run this type of problem past my students, at first they blurt out an answer. They say, "Well, let's see, one in five is 20%, so the probability's 20%." And I'm like, no, it's not. It is not one-fifth, or 0.2, or 20%, because remember, each time we take something off the end of the assembly line, that is an individual trial. It's either defective
or it's not. Then we take a second item: it's either defective or it's not. Then we take a third item: it's either defective or it's not. And so on, all the way up to our five items.

So note, this is where success and failure are counterintuitive. In this example a success is a defective product, because that's what we're interested in. So think of success as the item or characteristic of interest rather than as a literal success. The number of trials is five, because we're selecting five items at the end of the assembly line, maybe one every hour throughout the day or something, and the number of successes, meaning defective products, is one. So what is the probability of having one defective item in our sample?

As we talked about in the previous slides, this is 5 choose 1. I want you to actually think about doing it. The first hour we go up to the assembly line and we pick a product off the end of the line. Well, it's either defective or it's not. Let's say it is defective. That would be our one defective product, and then to meet this criterion the next four would be not defective. Or we could select one off the end of the line and it's not defective, but then the second one is defective. Or we could do it again and the third one is the defective one, or the fourth one is the defective one, or the fifth one is the defective one. That's why there are five different ways to do that, and that's where we get our 5 choose 1.

So, just a reminder: a success is a defective product. Here is how that could work. Remember, a failure is a good product and a success is a bad product. We could have these 5 outcomes: our defective one could be the third one we pick, it could be the first, it could be the fourth, it could be the second, or it could be the last. That's why we have 5 choose 1.

So let's put in some numbers here. Remember, our
probability of being defective, which is actually a success, is 0.2. Therefore the probability of being a good product is 1 minus p (I'm sorry, that should say 1 minus p), which is 0.8. So in the places of the S and the F, we just put in those probabilities. This is exactly like the last chart, but we take the S's and put in 0.2, and we take the F's, for failures, and put in 0.8.

Now remember, to find the probability of this trial sequence occurring, we just multiply the probabilities together. Let's look at the first row. We go up to the assembly line, and the product is going to be either a success or a failure. We pick up the product and it's good; the probability of that happening was 0.8. Then an hour later we grab another one. The probability of that being a bad product is 0.2 and the probability of being a good product is 0.8. It's good, so that's 0.8. Then we pick up the third one. Well, that's defective, and the probability of that happening is 0.2. Then we pick the fourth; that's okay, so 0.8. The fifth one's okay: 0.8. Now, the probability of that sequence of trials happening is 0.8 times 0.8 times 0.2 times 0.8 times 0.8. Then we add those up, or rather multiply those together, I'm sorry, and we have 0.08192. So that's a very specific outcome: failure, failure, success, failure, failure. The probability of that happening is 0.08192.

Now look at the second row. What we've done here is rearrange the numbers. In this one the success, the defective product, comes first, but the multiplication doesn't change. It's still one 0.2 and four 0.8s, so we multiply those together and get the same thing, 0.08192. Look at the third row. In this case the defective one is the fourth one, but again the math doesn't change. It's still four 0.8s and one 0.2, and that is 0.08192. So you can see the pattern here: just because we arranged the numbers differently in each outcome
doesn't mean the probability changes, and that's important. We have 5 choose 1 over on the left because we have five different outcomes for this one defective product, and each outcome has its own probability, which happen to be the same because the numbers being multiplied are the same.

What we're getting at is a general formula. If you look on the lower left, you can see this crazy-looking general formula, and that is just the formula without the numbers in it, so don't freak out. Remember, n is our number of trials, x is our number of successes, and p is our probability of success, so we can just substitute everything in. What we have here is: 5 is the n, our number of trials, and 1 is the number of successes we're interested in. Those are the 0.2s there in our chart. Now of course there are five ways to do that, so 5 choose 1 is 5, and that accounts for all 5 rows in our chart up there. But then in each row it's the same thing: we have one 0.2 and four 0.8s, so we can write that using exponents. So it's 5 choose 1, times 0.2 to the first, times 0.8 to the fourth. Again, that's just simple multiplication, but we use exponents to make it easier to write. When we go ahead and do that, we have 5 times 0.08192, which equals 0.4096. And again, that's the same thing as adding up the probability column over there.

So what can we conclude? In this case, if we select five products randomly in a sequence (we walk up and pick one, then we come back and pick one again, and so on, five times), the probability of having exactly one defective product in that sequence, or in that experiment, is 0.4096, or almost 41%. So if I walk up to the assembly line five different times during the day and pick a product each time, the probability of having one defective out of those five is 0.4096. See, here's where you've got to think differently. The defect rate is 20%, so you would think that, you know, there
would be 1 in there and that would be sort of 20%. But because these are trials, and each trial is its own choice, that's why it's 0.4096. You've got to think of each trial as its own thing.

So here is our general formula, which we already covered: n is the number of trials, x is the number of successes, and p is the probability of success in any given trial. In that previous example we had n equals 5, the number of successes was 1, and our probability of success, which was a defective product, was 0.2. We plugged that into our formula and came up with 0.4096, or about 41%. The probability of exactly one defective product in 5 trials of selection is about 41%.

So now what if we do this for each possible number of successes? Remember, we can have 0 successes, or 0 defective products; we could have 1 success, or 1 defective product, when we do that experiment; or 2, 3, 4, or 5. All we do is change the numbers in that formula. You can see in the calculation column we have 5 choose 0, 5 choose 1, 5 choose 2, because that's the number of successes. The exponent above 0.2 changes too, because 0.2 is p and its exponent is our number of successes: with one success the exponent is 1, with two successes the exponent is 2, etc. Now, with the failures we have 0.8, the probability of failure, and its exponent changes to match the number of failures. Notice that the exponents always add up to the number of trials: 0 plus 5 is 5, 1 plus 4 is 5, 2 plus 3 is 5, etc. So we just have the combination part, then the success part, and then the failure part of this formula, and from there it's just math.

We have the probability of 0 successes, which is the same thing as the probability of having no defective products: 0.32768. Then we move on down. The probability of having one defective product, or one success, is 0.4096, which is what we found in the previous example. Then we can go down: 2 defective products, 0.2048; 3 defectives in that 5, 0.0512; 4 defectives, 0.0064; and then finally the probability of all five of our
selections being defective is 0.00032, a very small probability. And that should make sense: even if we have a defect rate of 20%, the probability of going up to the end of the assembly line at 5 different points during the day, selecting a product each time, and all those products being defective is very low, 0.00032. Now, notice that if we add up that probability column, it has to add up to 1, and it does add up to 1.

Let's look at this in a chart. Here is our quality control problem. Remember, our defect rate was 0.2, so that's actually a success, that's p, and n, our number of trials, was 5. Along the left-hand side we have probability, and along the bottom we have the number of defective items, because that's our success. So 0 defective items, 0.32768; one defective item, 0.4096; and so on and so forth. This right here is the graph of our discrete probability distribution, our binomial distribution, for our quality control problem.

Let's go ahead and look at our first example, and then we're done. Here's our original problem. Remember, we have two salespersons, Joan and Margo. Joan has a success rate of 75% and averages 10 calls per day, so that's our trials: 10 trials. Margo has a success rate of 45% but averages 16 calls per day. So those are our success rates and our numbers of trials, and we're interested in the probability of 6 sales.

Let's look at Joan. Here is our general formula, where n is the number of trials, x is the number of successes, and p is the probability of success. For Joan the trials are 10, the number of successes we're interested in is 6, and the probability of success in her case is 0.75. We go ahead and insert these into the formula, and for Joan the probability of closing exactly 6 sales during any given day is 0.146, or 14.6%. That is the probability, as a decimal and as a percentage, of Joan closing 6 sales on any given day.

Now let's look at Margo. Same general formula, but her trials are 16, because she makes 16 calls per day. The number of successes is still 6, and the
probability of success in her case is 0.45. So we go ahead and insert those into our general formula and do the math, and the probability of 6 sales for Margo is 0.168, or 16.8%. Now, is that higher or lower than Joan's? Well, it's higher. So you have two things going on: the success rate and the number of trials, or calls, they're making each day. Even though Margo's success rate is much lower, the fact that she's making more calls makes up for that. And that's what's really cool about these types of problems, because you would think that Joan would have more success here, but really it's Margo, because of her greater number of calls made.

Let's look at these two together. On the left we have Joan's binomial distribution, and you can see that the height of each bar is the probability of making that specific number of sales per day. In the green bar, her probability is just below 0.15. But if we go over to Margo, we can see that hers is well above 0.15; it's nearing the 0.17 or 0.18 zone. So for six sales, Margo has the edge. For Joan it was 0.146; for Margo it was 0.168, or about 0.17, and that's the green bar.

Now what about the probability of making eight sales in any given day? Look at the blue bars. For Joan that's way up there, between 0.25 and 0.3, so we can say about a 0.28 probability of making eight sales. What about Margo? Well, eight for her is below 0.2; it's way down there. For eight sales, Joan's probability is 0.282 and Margo's is 0.181, so for eight sales Joan has the upper hand. You can see that there's sort of a crossover point. If we're looking at six sales during any given day, Margo kind of has the edge because she makes so many more calls, even though her success rate is lower. When we're talking about eight sales, now Joan
pulls ahead by quite a bit, because her success rate is so high and she makes ten calls per day, so she's going to have about, you know, three quarters of those calls turn into sales, as we can see there. So you can see there are different things going on, and this is why I find these interesting. We have the success probabilities and we have the number of trials, and we can see both come into play in those two distributions.

Okay, so just remember: even though Margo has a lower success rate in terms of making a sale, her greater number of sales calls per day overcomes the lower success rate when we're looking at six successes. Therefore the greater probability of making exactly six sales per day belongs to Margo, even though Joan's success rate is higher. The lesson here is that the probability of any given outcome in a binomial distribution depends on both the number of trials and the success rate. We'll keep that in mind as we do more problems involving the binomial distribution.

Okay, so that wraps up our discussion of the binomial distribution, using two really good examples of how this is used in business and other applications. I want you to keep a couple of things in mind. If you're watching this because you are having problems in your class right now, I want you to stay positive and keep your chin up. I have faith in you, and so should you. You can do it. The fact that you're on YouTube, or wherever you're watching this, trying to learn more means you're making the effort to better yourself, and that's what matters. Now, if you liked this video, please give it a thumbs up; it does encourage me to keep making them. If you think there is something I can do better, please leave a constructive comment below. And finally, most importantly, I want you to keep learning and keep having fun. Learning is the best part of life. So once again, I thank you very much for watching, and I look forward to seeing you again next week.
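
For readers who want to check the numbers quoted in the transcript, here is a minimal Python sketch of the general formula as the video describes it: the number of combinations (n choose x), times p raised to the number of successes, times (1 minus p) raised to the number of failures. The function name `binomial_pmf` and the variable names are illustrative assumptions and do not come from the video; `math.comb` is the standard-library "n choose x" function (Python 3.8 or later).

```python
from math import comb


def binomial_pmf(n: int, x: int, p: float) -> float:
    """Probability of exactly x successes in n independent trials,
    where each trial succeeds with fixed probability p.
    (Hypothetical helper name; not from the video.)"""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Quality-control example: n = 5 items, "success" = defective, p = 0.2.
defect_probs = [binomial_pmf(5, x, 0.2) for x in range(6)]
for x, prob in enumerate(defect_probs):
    print(f"{x} defective out of 5: {prob:.5f}")
print(f"sum of all six probabilities: {sum(defect_probs):.5f}")

# Sales example: exactly 6 and exactly 8 sales in a day.
joan_6 = binomial_pmf(10, 6, 0.75)   # Joan: 10 calls, 75% success rate
margo_6 = binomial_pmf(16, 6, 0.45)  # Margo: 16 calls, 45% success rate
joan_8 = binomial_pmf(10, 8, 0.75)
margo_8 = binomial_pmf(16, 8, 0.45)
print(f"6 sales: Joan {joan_6:.3f}, Margo {margo_6:.3f}")
print(f"8 sales: Joan {joan_8:.3f}, Margo {margo_8:.3f}")
```

Running this should reproduce the figures from the video to the precision quoted there: about 0.4096 for one defective item in five, about 0.146 versus 0.168 for six sales, and about 0.282 versus 0.181 for eight sales. Looping `x` over every value from 0 to n gives the full distribution shown in the bar charts.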
Info
Channel: Brandon Foltz
Views: 210,288
Rating: 4.9497371 out of 5
Keywords: brandon foltz binomial distribution, statistics 101 binomial distribution, binomial distribution brandon foltz, binomial distribution, the binomial distribution, binomial, bernoulli trial, binomial distribution statistics, probability, distribution, binomial probability distribution, brandon foltz, anova, linear regression, logistic regression, quality control, binomial experiement, bernoulli distribution, probability examples, statistics 101, Binomial distribution probability
Id: ConmIDAzRqI
Length: 36min 50sec (2210 seconds)
Published: Wed Nov 28 2012