Statistics 101: Introduction to the Poisson Distribution

Video Statistics and Information

Captions
So in this video we're going to continue our discussion of probability distributions. Previously I did a whole video series on the binomial distribution, so in this video we're going to take the next step and talk about the Poisson distribution. Let me just point out that's how it's pronounced: the Poisson distribution. It's not, like someone said in class once, the "poison" distribution. That did happen, I promise. It's the Poisson distribution. It's very similar to the binomial distribution, and actually it's kind of a cousin of the binomial distribution, so if you do know something about the binomial distribution, this will make a whole lot of sense, because it works in a very similar manner.

Now, it's used not only in statistics but also in fields like operations management, specifically in what's called queuing theory. A queue, especially if you're in Britain, is what we in the United States call waiting in line. So you're in the queue, or you're in line. Think about it: let's say you go to a bank, you walk in, and you get in line. Once you're in line you wait a certain amount of time, and then you proceed forward to get whatever banking services you need. There's a service time, and when you're done with your service, you walk out. The study of that whole process is called queuing theory in operations management and management science. So this is used not only in stats but in other disciplines, with very practical implications, and actually the example we're going to use in this video is a very simplified version of a queuing theory problem, or sort of a waiting-in-line problem. So all that being said, let's go ahead and dive right in.

This is what I call the checkout line example. There are some people waiting in the checkout line at the grocery store, and here's our problem. Let's say that you are a cashier at Walmart. It's 4:30 p.m. and your shift ends at five o'clock. Now the store 
policy is to close your checkout line 15 minutes before your shift ends, so in this case your checkout line should close at 4:45. That way you can finish checking out whatever customers remain, and of course you'll probably take your register to the office to balance out and things like that. So it's 4:30, you're scheduled to get off your shift at five, but they're going to close your checkout line at 4:45 to make sure you get out on time.

Now, by examining the overhead cameras, like the security cameras in the store, the store data indicates that between 4:30 and 4:45 p.m. on each weekday (we're going to assume this is a weekday) an average of 10 customers enter any given checkout line. So you can imagine, as a customer, you're walking around the store, you get all the things you need, then you walk up to where the checkout lines are and you decide, probably by which is the shortest, which line to get in. Once you become part of that line, or part of that queue, that's what we're considering here. So between 4:30 and 4:45 each weekday, on average 10 people get in line, and that's all we're worried about. We're not worried about how long it takes to check them out or anything like that, just the number of people that get in line.

So here are the two questions we're going to answer in this video. First, what is the probability that exactly seven customers enter your line between 4:30 and 4:45? Second, what is the probability that more than 10 people arrive in your line between 4:30 and 4:45? Now remember, on average we're expecting 10 customers to come into our line during that 15-minute period, but we're going to look at the probability for exactly 7 and then the probability for more than 10, which is more than we would expect. Of course, that means you probably would not get off work on time.

So here's our time frame: between 4:30 p.m. and, we're going to say, 4:44:59. I'm not going to include the exact 4:45, but it's really not important. Now here's the question I 
want you to answer with your intuition: which outcome for the number of customers that get in line is your expected value, or the mean number that come? Actually you should already know this; it was in the question. We have a continuum here. We could have zero customers during that 15-minute period. It's probably not that likely, but it could happen. Theoretically, we could have an infinite number of customers coming to our line. Is that actually going to happen? No, but that's why it says "in theory." That's part of the Poisson distribution: it could theoretically go all the way to an infinite number of customers, but of course that's not going to happen.

Now what number of customers would you expect? Well, it's not a trick question, because remember, in our problem it said we would expect 10 people to get in line. That's what the store figured the average was. So that is our expected value, that is our mean value. In these types of problems, remember, the expected value is just the mean of a discrete probability distribution, so the expected value and mu, the symbol for the mean, are really the same thing. Don't get confused; they mean the same thing. But again, we would expect 10 customers in that 15-minute period.

Now if we graph that in Excel (and I'm going to show you how I got this graph in a different video), you will see that the x-axis along the bottom is the number of customers, starting with zero customers on the left-hand side, and the y-axis, the vertical axis, is the probability associated with each outcome. Down at the left-hand side we have the probability of zero customers. That's not very likely; as you can see, the bar is very, very short. There is probability there, but not very much. And as the number of customers goes up toward the mean, the probability gets higher and higher. So if you look at 10, 
which is the dark green bar there (remember, that was our mean), we can see that the probability seems to be about 0.125, somewhere in there. But the probability for nine customers is exactly the same, and sometimes that happens in these discrete probability distributions. Remember, this is for our time interval of 4:30 to 4:45. Even though 9 customers and 10 customers have equal probabilities, the mean or expected value is still 10. It just means that the probability of 9 customers arriving and the probability of 10 customers arriving are the same. It doesn't affect the mean, and that's just what sometimes happens in discrete probability distributions. Remember, the height of each bar is the probability for that outcome.

Now as you can tell, as we go past 10 and on to 11, 12, 13, 14 customers, the probability keeps going down. Again, in theory this goes all the way out to infinity. It's kind of a limit problem, so if you've done other mathematics you kind of understand what limits are. It never really gets anywhere; it theoretically goes to infinity, and the probabilities become so small that they are basically zero. You can see this distribution looks very similar to binomial distributions, and that's not by accident, because they are related. They're not the same, but they are related.

So let's talk about a formal definition here. The Poisson distribution focuses on the number of discrete events or occurrences over a specified interval or continuum. That could be time, length, distance, or whatever; it's over some continuum, and we count the number of events that happen only within that interval. That's what makes the Poisson what it is: the number of events in a specified interval. And we have a new symbol for that, at least one we haven't used in my videos. It kind of looks like, I don't know, a little triangle thing, but it's the Greek letter lambda, and that is 
just the expected number of occurrences over the specified interval. Again, you can get symbols mixed up, but lambda is the same thing as the expected value or the mean. These all mean the same thing. You could actually substitute one for another, and I'll show you that in a minute, and it doesn't change the meaning.

So let's look at our checkout line example. Lambda in that case is 10 customers per 15 minutes, so our lambda here is 10. Now it's important to point out that this is not a fraction you're going to simplify. I mean, you could, but it would be like two customers every three minutes or something like that, and we're talking about a 15-minute interval. That is the interval we decided on, and 10 customers are expected during those 15 minutes, so our lambda here is 10. Do not try to simplify that, because we're interested in our 15-minute period.

So let's talk about a few characteristics more formally. The Poisson has discrete outcomes: 0, 1, 2, 3, 4, etc. The number of occurrences in each interval can range from 0 all the way to, theoretically, infinity. It describes the distribution of infrequent or rare events. Each event is independent of the other events. It describes discrete events over an interval, which we already talked about; that's kind of what makes it what it is. And the expected number of occurrences is assumed to be constant throughout the experiment.

That last one is something we kind of assume. Think about our example: during a 15-minute period we expect 10 customers, so we kind of assume that those 10 customers arrive at our line at an even rate. Is that actually going to be the case in real life? Probably not. Now, there are types of queuing problems where you can have a constant arrival rate or a constant service rate. As far as a constant service rate, I'll give you an example right now, and that is the automatic car wash. When you drive your 
car into a car wash, at least here in the United States, your car moves along that car wash at a constant speed, and every car goes through it at the same speed. So in that case the service time of the car wash is constant no matter what car goes through. But in our checkout line, people are going to arrive at different times. For this interval, as long as we're dealing with this interval, it really doesn't matter. We're kind of assuming that people arrive at an even rate. So just keep that in mind. It's not really that important here, but it is a characteristic of the Poisson distribution.

Of course there's a formula that goes along with it, and here's what it looks like. Now that's kind of crazy looking, but I don't want you to freak out; it's actually fairly simple. Here we have the probability of an outcome x. We could say seven customers, 12 customers, no matter what x would be. That equals lambda raised to the x power, multiplied by e (which is a constant; we'll talk about that in a minute) raised to the negative lambda power, divided by the factorial of x. Written out, that's P(x) = lambda^x * e^(-lambda) / x!. In some books you'll see it another way, where instead of lambda they use mu, but remember what I told you a couple of slides ago: that's the same thing. Lambda is the formally correct way, but I have seen it with mu in place of lambda, and it doesn't change the meaning.

So x could be 0, 1, 2, 3, 4, all the way to infinity. That's the number of occurrences we are interested in. In our problem, the first question is about seven customers, so x would be seven in this formula. Lambda, or mu, is the long-run average, which is the same thing as saying the expected number of occurrences in whatever interval we specify. And then e is the constant. If you haven't had this since high school or whenever, it's that 2.71828 number, which is the base of natural logarithms. It's that button on your calculator that says e. It's just a 
constant. So all we have to do is plug in, or substitute, our numbers into this formula to find our probability.

So let's solve our first one. What is the probability that exactly seven customers enter your line between 4:30 and 4:45? Here is our formula; I'm going to use the lambda version. x here is seven. Lambda is 10, because remember, we're expecting 10; that's our average, or long-run average. And e is a constant; it's always e, so 2.71828. Now we substitute those numbers into our formula. Lambda is 10 and x is 7, so it's 10 to the seventh, multiplied by e to the negative 10 (because it's e to the negative lambda), divided by 7 factorial (because 7 is our x). When we do that in our calculators we come up with 0.09, which of course is 9 percent. So the probability that exactly seven customers come to our line between 4:30 and 4:45 is 9 percent, or 0.09.

Now if we go back to our graph, at the top we have lambda equals 10 and x equals 7; that's what we're interested in. If we go over to the 7 bar in our graph (I've colored that in green so you can see it) and go over to our axis, look where it meets up: at 0.09. So we can use our graph to check our calculation, and they both concur, so we know it's correct. The probability of seven customers coming into our line in that 15-minute interval is 0.09, or nine percent.

Now there's a way to figure this out on a TI-83 calculator. That's what I use; I have a TI-83 Plus, but I'm sure this is the same for all the TI models. So here is our graph, and again we're interested in the height, or the probability, which is the same thing, of that green bar, which is x equals seven. The function in the calculator is the poissonpdf function, and it requires two arguments: lambda and the number of occurrences we are interested in. To find it you just press the 2nd and the VARS buttons; that takes you to the distribution menu. You'll scroll down and you will find the 
poissonpdf function. Press ENTER, then type 10, comma, 7, which will put the command into your calculator, so it'll say poissonpdf(10,7). Then of course press ENTER. That will give us the exact probability of x equals 7, and it comes back 0.09. So you can find this exact probability on the TI-83 or TI-84, and I'm sure on the 85 and 89 calculators.

Now a couple of things to keep in mind if you do this on your calculator. Number one, it's very easy to accidentally select the binomial pdf function, because they're basically right next to each other, so always make sure you're selecting the Poisson pdf when you're finding an exact probability. On that note, this command in the calculator is for finding the probability of an exact outcome: exactly 7, or exactly 12, or exactly 4, or whatever. We're going to talk about the Poisson cdf function in a minute. So the pdf is the probability of any one specific outcome.

Remember, the second part of our question was the probability of more than 10 customers arriving. If we look at this, anything more than 10 would include 11, 12, 13, 14, 15, and so on, so we're interested in the green part of our graph here. But how do we find that? Well, if you remember, in the binomial videos I did something very similar. Remember that the entire probability distribution has to add up to one; that's just sort of the law of probability distributions like this. So if we want to find the green, we can take one, which is the whole, and then subtract, or sort of chop off, the purple, and that will leave us with the green. We can sort of do a puzzle here and figure out the green by subtracting the purple part from one. So how do we do that? Here's our graph. What's the probability that more than 10 customers enter your line between 4:30 and 4:45? Here's our formula again. Now x is 10.
Now why is that? Why is our outcome of interest 10? Well, if you look on our graph, that is the upper limit of the purple region we're going to want to subtract out. We're going to find the purple area and subtract it from 1. So x in this case is 10. Lambda is always 10 for this distribution. What I mean is, lambda is always 10 for this problem; it can obviously be different for different problems. And e is the constant 2.71828. So now we can go ahead and do this. To find the green part, we take one minus the cumulative probability all the way from zero up to ten; I'll show you how to do that in a second. We go ahead and subtract that purple part out, and we have 1 minus 0.583 equals 0.417, and that is the probability of the green part of our graph.

Now I do want to point something out here, which I'll show you in the next slide. To find the cumulative probability there in the purple, in our calculator we will not use the poissonpdf function; we're going to use poissoncdf. So here's what we're trying to find, and the function we want in this case is poissoncdf. The c stands for cumulative, so the cumulative will give us the probability all the way from zero up to ten, and one minus that will give us the green part. Just remember: the pdf version of the function gives us an exact probability at any given exact outcome; the cdf gives us the cumulative probability all the way from 0 up to whatever number we specify, in this case 10.
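The two calculations so far (exactly 7 customers, and more than 10 customers by the one-minus-cumulative trick) can be checked with a short Python sketch. This is only an illustration of the formula from the video, not the calculator's implementation; the helper names `poisson_pmf` and `poisson_cdf` are made up here to mirror the calculator's poissonpdf and poissoncdf commands.

```python
import math

def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x): lambda^x * e^(-lambda) / x! (hypothetical helper name)."""
    return lam ** x * math.exp(-lam) / math.factorial(x)

def poisson_cdf(x: int, lam: float) -> float:
    """P(X <= x): add up the exact probabilities for 0, 1, ..., x."""
    return sum(poisson_pmf(k, lam) for k in range(x + 1))

lam = 10  # expected customers in the 15-minute interval, from the example

p_exactly_7 = poisson_pmf(7, lam)      # the single bar at x = 7
p_at_most_10 = poisson_cdf(10, lam)    # the region from 0 up to 10
p_more_than_10 = 1 - p_at_most_10      # the remaining region, 11 and up

print(round(p_exactly_7, 3), round(p_at_most_10, 3), round(p_more_than_10, 3))
# prints: 0.09 0.583 0.417
```

Summing the exact probabilities from 0 to 10 is the simplest way to get a cumulative value for a small x; a statistics library would normally be used instead for large inputs.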
So we go ahead and do that subtraction, and the whole area of the green is 0.417, which means the probability of more than 10 people arriving in our checkout line between 4:30 and 4:45 is 41.7 percent. That's a pretty good probability, or a pretty good percentage, of more than 10 people coming into our line during that time.

So, our answer summary. What was the probability that exactly seven customers enter your line between 4:30 and 4:45? It was 0.09, which is nine percent. What was the probability that more than 10 people arrive? That was 0.417, or 41.7 percent. Again, these are just the probabilities of these specific outcomes, given the time interval we're interested in and the mean number of people we expect to come into our line, based on some previous data we have or calculated.

Finally, a bonus question. Let's say your manager decides to be nice and you get to close your register at 4:40 instead of 4:45. How many customers do you expect to arrive between 4:30 and 4:40?
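For the bonus question, here is a minimal Python sketch of the rescaling, assuming (as the video does) that arrivals come at a constant average rate, so the expected count scales in proportion to the length of the interval. The function names `rescale_rate` and `poisson_pmf` are hypothetical helpers invented for this illustration.

```python
import math

def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for a Poisson distribution with mean lam (hypothetical helper)."""
    return lam ** x * math.exp(-lam) / math.factorial(x)

def rescale_rate(lam: float, old_minutes: float, new_minutes: float) -> float:
    """Scale the expected count to a new interval length.

    Assumes a constant average arrival rate across the original interval.
    """
    return lam * new_minutes / old_minutes

# 10 customers expected per 15 minutes, rescaled to a 10-minute interval
lam_10_min = rescale_rate(10, 15, 10)
print(round(lam_10_min, 2))  # about 6.67 customers

# Most probable count under the rescaled mean (search a generous range)
most_likely = max(range(30), key=lambda k: poisson_pmf(k, lam_10_min))
print(most_likely, round(poisson_pmf(most_likely, lam_10_min), 3))
```

The new mean can then be used in the same formula as before; only lambda changes, not the shape of the calculation.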
So here is the change we have to make. We expected 10 customers in that 15-minute interval. What if we change it to a 10-minute interval? Now, that 10 minutes is within that 15; it's not outside that 15. For the 10 minutes within that 15-minute interval, what would we expect? Well, it's actually a simple cross-multiply ratio problem, 10 customers times 10 over 15 minutes, and that is 6.67 customers. So if we close our register at 4:40, during the 10 minutes from 4:30 to 4:40 we would expect between six and seven customers, somewhere in there, to come into our line. If we change the interval, we have to change our expectation of customers. But again, that 10-minute interval is within our original 15-minute interval, and that's very important.

So what if we do a graph of this distribution? Well, it looks like this. The time interval is now 4:30 to 4:40 and our expected value is 6.67. We can use Excel to do this graph, and if you look, the most probable outcome is 6 customers, and the probability of that is a little over 0.15, about 0.155, something around there. You can see on either side of that the probability gets lower: the probability of 5 is a little bit lower than 6, the probability of 7 is a little bit lower than 6, and it falls off from there. It's a very similar shape to our previous one.

Okay, a quick review and we are done. Remember, the Poisson has discrete outcomes: 0, 1, 2, 3, etc. The number of occurrences in each interval can range from 0 to, theoretically, infinity. It describes the distribution of infrequent or rare events. Each event is independent of the other events. It describes discrete events over an interval, so an interval of time, distance, et cetera, and that's important; that's sort of what makes a Poisson what it is. And finally, the expected number of occurrences is assumed to be constant throughout the experiment. That's important because of the question I did in the bonus section, where I changed the interval from 15 minutes to 10 minutes within the original 15-minute interval. Because I assumed the 
arrivals into the line were constant, I could do a simple ratio. Again, in real life it doesn't always work that way, but it's a good enough approximation that we can go ahead and make that assumption.

Okay, so that is our introduction to the Poisson distribution. I'll be doing some more videos on this, with some more examples and some more real-life applications, so make sure you check back for those. If you're just watching this for the first time, a few things to keep in mind. If you're watching this because you are struggling in a class, I want you to stay positive and keep your chin up. If you're on here trying to improve yourself and your education, that's what really matters. The fact that you're making the effort to learn and understand and better yourself, that is what really matters, at least to me, and I hope it does to you. Please feel free to follow me here on YouTube and/or on Twitter; that way, when I upload a video, you know about it. If you like the video, give it a thumbs up; it does encourage me and other screencasters to keep making our videos. On the flip side, if you think there is something that I can do better, leave a comment below and I will try to incorporate that in future videos. And finally, as you learn, I want you to have fun. I want you to enjoy learning, knowing that in the future
Info
Channel: Brandon Foltz
Views: 260,262
Rating: 4.8999515 out of 5
Keywords: poisson distribution explained, poisson distribution, poisson, poisson probability distribution, probability, introduction, poisson distribution probability, distribution, poisson distribution probability examples, brandon c. foltz, brandon c foltz, statistics 101, brandon foltz, introductory statistics, statistics, linear regression, anova, logistic regression, probability examples, stats help, stats tutor, intro stats help, intro stats videos
Id: 8px7xuk_7OU
Length: 29min 16sec (1756 seconds)
Published: Wed Dec 12 2012