Statistics Lecture 6.4: Sampling Distributions Statistics. Using Samples to Approx. Populations

Captions
So we're going to talk about section 6.4, sampling distributions. Here's the idea. In every other section we've covered so far in chapter 5 and chapter 6, up to this point, you have been dealing with populations. Notice how you've been writing mu and sigma for all of your homework (hopefully you did on the one you're going to turn in, right?), signifying that we're dealing with entire populations of people. We don't always deal with populations in this class. This class is called statistics because we're going to be taking samples, not populations. In general it's very hard to know everything about an entire population, so we deal with a sample instead. However, in order to make the transition from dealing with populations to letting samples approximate our populations, we need this section, sampling distributions, to talk a little bit about it.

Now I'm going to warn you about the reason people find this section confusing: there's a whole lot of theory and not a lot of application. There are no real problems that we're going to do; it's all a bunch of theory. That's why people find it confusing, and that's why I need you to read through the section on your own as well as today. This might not do it for you completely, so I need you to read through it when you get home as well. It's section 6.4. With me so far?

So here's the underlying idea: we're going to try to use sample measurements to estimate population measurements, sample statistics to estimate population parameters. So the idea is to use a sample to estimate a population, and of course, you know, specific characteristics of those things. You see, it's not good enough to say, "Oh, I'm pretty sure this sample is going to represent this population, let's go for it." You can't do that without a whole lot of legwork, and this is the legwork for that.

So the first thing I need you to notice is that if we're dealing with a population of people, or a population of
anything, then if we're going to take a sample from that, there are a whole lot of possible samples we could take, right? Let's say this is your population: this classroom is all the people that exist in the world. We don't know each other very well, I'm sure, but let's say that this was the population, and I said I want to pick out a sample of five people. How many of you are there? One, two, three, four, five, six, seven, eight, nine, ten... there's twenty-seven. I just randomly guessed that. So there's 27 of you. Let's pretend there are only 27 people in the whole world; you 27 people, you're the chosen ones. And I'm going to pick out samples of 5, because I don't want to deal with all 27 of you. Is there more than one possible sample of five? Well, I could pick you five; I could pick you five. That's two different samples right there. I could pick all four of you and then her instead of you at the very end. That's a different sample of five, isn't it? Even though most of them are the same, that one's different. Are there lots of different samples of five people in this classroom? A lot. Specifically, how many?

Let me change this. Oh, sorry: I use a lowercase letter n for a sample and an uppercase letter N for a population. Hopefully you remember that from our first couple of days of class. Good. How many samples of size n are possible out of a population of size N? I just said "n" twice, so if you're not reading the board, that's not going to make sense at all. So how many size-little-n samples are there out of a population of size big N? We could count. Using this example, there are 27 people in class (there are a few more now, thank you, but let's say there are 27), and I'm picking samples of size 5. We could count them. Let's try it; otherwise you need a better way to do it, so stop me if you figure out a better way. That's one, there's two, there's three, there's four, there's
five, there's six, there's seven... keep going... there's ten. Do you want to do it that way? You'll be here for the rest of your life. You just figured it out that quick? That's amazing. That's pretty good.

One more question. Let's say that this is my sample of five people. Let's say I picked her out, then her, then him, then her, then her. Would it be different if I picked him out first, and then her, then her, then her, then her? Is that the same sample or a different sample? It's the same five people. If I asked them all the same question, does it matter who got picked first? Does order matter here, or does order not matter? Do you see where I'm going with this? How many people are in our class for this experiment? Twenty-seven. We have 27 people and we want samples of size five. How many possible ways can you do that where the order doesn't matter? What are we talking about? Say it louder: a combination. Why not a permutation? That would be if the order mattered. However, for a sample, the order in which you pick it doesn't matter, as long as you have the same exact people, no matter which way you picked them. So we're dealing with a combination.

So how many possible samples are there if you have N people or N items in a population and you want n of them? It's N choose n, written NCn. That's how many possible samples there are. If we have 27 people in the classroom and we want samples of size 5... say that again, how much was it? That's what you get: if we did this in this classroom and I picked out samples of size five, there would be 80,730 different samples of size five, with just this class. Do you want to take all of them? Do you want to actually do that? Because to accurately represent a population, you either have to take every possible sample or you have to use this theory
that I'm about to show you. Are you ready to learn it? Trust me, you're going to want to learn this; otherwise you can't make the next step, otherwise you're forced to take all 80,730 samples. That's a lot. Even if I had already counted eight or so, there would be 80,722 of those remaining for me to go through before I exhausted all possibilities.

Now, to go any further I need to tell you what a sampling distribution actually means. It's a sampling distribution of blank. We can have one of three words here, well, one of many words actually: the mean, the proportion, the variance, the standard deviation. Sampling distribution of, and then some sort of statistic; I'll put that in parentheses so you know it's some sort of statistic, such as the mean, the proportion, the variance, et cetera. Of course "et cetera" is not a statistic for us; it just means "and so on." So: sampling distribution of some sort of statistic.

What a sampling distribution says is a kind of abstract idea, because no one ever actually does it. For a sampling distribution, I want you to assume or pretend that you just took all the samples that are possible of a certain size out of a population. Stick with me on this. For our example, you would have taken all 80,730 samples, and you would have found out from each sample what the mean was, or what the proportion was, or what the variance was, for those five people. So I'd ask each of you what your height is and average it. That's one statistic from that one sample, right? Are you following me here? You do that 80,730 times. You have 80,730 different samples, which means you're going to have 80,730 different average heights for those
groups of five people. Then you organize all of those into a table, and that is a sampling distribution. So what a sampling distribution does is this: it takes all the different samples of a certain size from a population, finds the key statistic you're looking for for each one of them, and puts them in a table. Do you understand the idea of a sampling distribution? "Sampling" because you're taking samples; "distribution" because you're organizing them. Sampling distribution.

Now let me ask one more question before we go further. What if I change this to not just groups of five; I want groups of six now. Does this number change? Yes, it would change. So this is for each particular sample size that you want. If this number changes, does this number change? Yes. So it's a very specific thing: a set population size, a set sample size, every possible sample of that size, and then you organize the statistics for each one of them. Are you ever going to do that? No one ever does that. That's a lot. We're going to look at this in a minute.

So, sampling distribution; I'll write this up in words for you. Assume you take all the possible samples of size n. Again, for us that would be every possible sample of size 5 in this classroom. And you find a certain statistic for each one of them, the required statistic, the one you're looking for, for each sample. In our case that would be all 80,730 of them: we would have 80,730 different means, 80,730 different proportions, 80,730 different variances. Nod your head if you understand that. If you organize all of those in a table or a graph, et cetera, this is called a sampling distribution. If you organize all of those different statistics for the different samples into a chart, graph, or table, you are going to get a sampling distribution. A table is all you need; you can make a chart out of a table, and so on. So much writing that my pen ran out of ink
on me. I'm going to recap this one more time. I seem to need a new pen every time. Okay.

So here's the idea. You have this population of size big N. You say, "I want samples of size n"; you pick that, like 10, 5, whatever. I picked 5. The number of different samples you can get out of that population for a certain size (remember, you're picking that size ahead of time; I picked five, and you can change that) is big N choose little n. That's how many different samples are possible, in any situation, for a certain size. We take the statistic for every single one of those samples, all 80,730 in this case. This is a relatively small population and a small sample size; if you increase these, you get massive numbers. You organize every one of those statistics, the required statistic you've been given, such as the mean or the proportion, into a table, and that's called a sampling distribution. Do you understand the idea of a sampling distribution? Good.

Okay, now let's learn what really happens here. I'm going to give you a very simple example to illustrate what sampling distributions are going to mean for us. Are you ready to follow this example? We're going to pretend that in the whole world there are only three numbers. The only three numbers you can possibly have are one, two, and five. So if you started counting your age, you'd go: I'm one year old, then I'm two years old, then I'm five years old, and that's it. We'd all end up five years old. There's no other value besides one, two, and five. You with me? No other numbers exist. It'd be like having a three-sided die, which is literally impossible, but let's pretend for this example. You know you can't get a three-sided die, but let's pretend you had one. The only numbers on there would be 1, 2, and 5. That's it. The sides are labeled 1, 2, 5, and you get to roll it. Do you
understand? So the only things you can get out of it are 1, 2, and 5. Here's another way to think about this: how many odd numbers do you have? How many even numbers? Okay, thankfully we get that right: we've got two odd numbers and one even number. Can you tell me the proportion of odd numbers? Not a percentage, but the decimal or fraction version of a percentage. How many odd numbers do we have out of how many numbers total? So our proportion is 2/3; you'd say two-thirds of these numbers are odd. You with me on that? The proportion of odd numbers is 2/3.

This is not a sampling distribution, by the way. This is looking at the entire population. Understand that this is the population proportion. Are you with me? The population proportion is a lowercase letter p, so in our case p = 2/3, the population proportion of odd numbers, because this is our population. You following that? That's all that's possible: 1, 2, or 5. So the population proportion of odd numbers is 2/3. This is a very, very simplistic example; I only have 3 items, so big N would be 3. That's it; all you can choose from is 3.

Now what I'm going to do is create for you, with you, a sampling distribution. We're going to take all the possible samples of size 2. So big N equals three, little n equals two. Now something interesting happens here. I'm going to break the rules a little bit. These are all the different samples that you could get; however, I'm going to label the different orders of these as unique samples, just so the probabilities come out right. Having the different orders raises the probability, because you're dealing with samples of two numbers at a time, so you have to multiply, and there's some probability work involved. So I'm going to cheat just a little bit. We're not
going to get the count from the formula. Do 3C2 on your calculator real quick and you get three, so there are three actually unique samples. However, what I'm going to list out is every possible ordering of those samples, because out of those samples the probability is actually higher for certain ones of them. Does that make sense? So we're going to fudge a little bit. There are two ways you can do it: you can change the probability, or you can list out every possible choice. I'll show that to you when I create this.

We're going to make a table here. We're going to list out every possible sample on the left, and I'll show you what we want to do on the right. Give me the first possible sample. This is with repetition; remember, it's like you're rolling a three-sided die (which literally is impossible, but let's try that), or picking numbers out of a hat with replacement. So what could you get for your first sample? You could get 1, 1, couldn't you? Then you could get 1, 2. You follow? Then you could get 1, 5. I'm going to run out of room; I made my numbers too big. Then 2, 1; 2, 2; 2, 5; 5, 1; 5, 2; and 5, 5. Do you agree these are the nine possibilities that you can get? Yes, no? That's it.

Now, I do need you to notice something: as samples, this one and this one are the same. You follow? This one and this one are the same, and this one and this one are the same. So you have fewer actual samples than what's listed. However, this is every possible roll of that die; this is everything that possibly could happen. If you were to make one sample out of each of these pairs, you'd have to alter the probability of getting it. I don't want to alter the probability, because this one is one sample that can only happen one way, while this can happen two different ways, as in this one and this one, but not that one. I don't really want to confuse that idea, so I'm going to
keep these listed as different samples. You get the picture? Even though I know these are the same sample, probability-wise I'm going to leave them separate and deal with the probability that way. Nod your head if you're still okay on that. That's a little confusing, I get that: I said that different order doesn't matter for samples. It doesn't, but it does matter for probability, and we are combining the idea of different samples with probability right now. Ready?

So these are our different samples, samples of size n = 2. What I want to do now is figure out the probability that each of these is going to happen. How many choices do I have? Nine. So what's the probability that this one's going to happen? We're assuming our three-sided die is not weighted funny, right? It's a normal three-sided die. So this probability would be 1/9. You okay with that? What's this probability? 1/9. All the probabilities are 1/9. Here's what I was talking about: if you were to combine these samples, like this one and this one, you'd alter the probability to 2/9. That's the only thing you'd do, but I needed to show you this in order to do that. Does that make sense? The probability for this one would stay 1/9, since there's not another one of those. The probability here would be 1/9, the same probability. And this one that you combined would be 2/9. So far so good?

Okay, now if this is the probability, the other thing we're going to talk about is the sample proportion. The population proportion is a lowercase letter p; in our case, for our population, we had two-thirds of those numbers being odd. There's also another quantity to talk about, and that's the sample proportion. The population proportion is p; the sample proportion is also p, with a little hat: p-hat. Not kidding. Seriously, I'm not even making that up; it's pronounced "p-hat." So p-hat stands for sample
proportion. Usually I make stuff like that up just to mess with you guys, but this time I'm being honest. Okay.

So first, are you okay that these are all of our samples? Yes, no? With replacement. You may also notice that I'm doing this with replacement, but in real life would you ever do that? If I wanted a sample of size 5 and I go, "Okay, let's pick out five people. Kayla, what do you think? Okay, good. Kayla, what do you think? Let's pick someone else. Okay, Kayla, what do you think? And we need one more; how about Kayla? Kayla, what do you think?" Would you ever do that? No, you wouldn't. You're going to read something in that section, and I want you to read it, about why we can consider sampling with replacement in this situation. The book explains that pretty well. Read through why we are doing this with replacement; it's a good little read. I don't remember what page it's on, but read through it; it explains with replacement versus without replacement and why we use replacement when we do this.

So, are you okay that all these probabilities are 1/9? We also want to find p-hat. We're going to learn something right now that's going to tell us the relationship between p-hat and p, our sample proportions and our population proportion. What does the p stand for again? What are we looking for? This would be the population proportion, this would be the sample proportion, of what? Look at our problem. The proportion of what? Odd numbers. Two-thirds of the population were odd numbers. We're going to go down this list and figure out the proportion of odd numbers for each of these samples. First one: how many numbers do I have here, and how many of them are odd? 100% of them, 1.00, are odd numbers. 1.0. Do you follow that? That's the proportion: all
of them. What is the proportion of odd numbers in this case? 0.5, sure, half of them. What's the proportion for this case? How about this case? How about the 2, 2? How many odd numbers do we have there? Zero. How about here? How about here? And lastly we have 1.0.

We're going to recap for about ten seconds on what we've been doing. We have a population that has only three items in it: one, two, and five. The proportion of odd numbers in that population is two-thirds. We then took samples of size two. Every sample of size two, we listed out, with replacement. The probability of getting any of those samples is 1/9. The proportion of odd numbers is what we listed out for each sample right here. Do you feel okay with what this is? Do you understand that this is a sampling distribution? This is every possible sample of a certain size (that size is two), and this is what we're looking for, the proportion for each of those. So our statistic in this case is the proportion. You with me?

Now the question is (and I accidentally erased my p): do my p-hats equal my p? It's a hard question. Don't mess up the p; I'll get really pissed off. Get it? Ha ha. That was funny; you can laugh. Do my p-hats equal my p? Does this p-hat equal my p? This is 1.0; the p is 2/3. Is this the same? Is this the same? Are any of these the same as our p, 2/3? No, not one of them. However, something really interesting (this is the key point here for the whole section) is going to happen if I average these. While none of these equals 2/3, what I want to do is find the average. If you don't remember how to find the mean of a distribution like this: you add up x times P(x). We're going to treat p-hat as our x, so in our case it's going to be p-hat times P(x), or P(p-hat).
We're going to multiply these things and add up that column. Hopefully you remember that from probability distributions; that's how you found the mean: you multiply these two things and you add up that column. Remember that? I hope that you do. So right here, when I extend my table with another column, I'm going to have p-hat times P(p-hat). We're going to do some fancy fraction stuff. There are only three numbers you work with: zero (I'll take care of the zeros), one (I'll take care of the ones), and 0.5; that's the only tough one. 1.0 times 1/9 gives you 1/9. Do you follow? Half of 1/9 (you could have left that as one-half) is 1/18. Do you see where the 1/18 comes from? 1/9 divided by 2 is 1/18. This is 1/9, this is 1/18, this is... how much is this? Zero. This is 1/18, this is 1/9, this is 1/18, and this is 1/9. Are you okay with that so far? Are you sure?

So we've multiplied this column times this one; we've got p-hat times P(p-hat). If I add up this column right here, I'm going to get the average. If you're confused, I need to know. Do you know where these numbers are coming from, the 1/9 and the 1/18? Do you know that if I add them I will get the mean, just like any other probability distribution? It's still the same thing, a probability distribution, which happens to also be a sampling distribution. If I add this column, here's what happens. This one and this one together give you another 1/9, because you'd have two-eighteenths, right? Let's cross those out. This one and this one also give you 1/9; let's cross those out, just to make it easier. So we have 1/9 + 1/9 + 1/9 + 1/9 + 1/9 + 1/9; that's one, two, three, four, five, six: six-ninths. Did you follow how I'm getting six-ninths? If you don't, just add all these up on a calculator; you're going to get six-ninths. So the mean
equals six-ninths. Wait a minute, what is six-ninths? Two-thirds. What the heck! I'm Harry Potter and this is my magic wand right there. Wait, none of these were two-thirds. Not one of our sample proportions was two-thirds, yet when I averaged them, it equals exactly two-thirds. Is that magic? No, it's actually expected. Think about what you're doing: you're taking every single possible sample. So sure, you're going to get some that are higher and some that are lower, some way lower, but you're averaging them. If you take all the possible samples and you average them, you're going to get exactly what your population originally gave you. Exactly.

Now this might not seem like such a big deal to you, but in our populations we don't just have three things, do we? We have 27 things, or a million things, or something like 300-plus million things, like the people in America. That's a lot of people. We can't do this all the time, right? But what this says is: while not every sample gives you the population proportion, the average of all the sample proportions will give you the population proportion. In other words, the sample proportion targets (that's the key word for us, "targets") the population proportion. It doesn't systematically overestimate it or underestimate it; it will be around it. Now, with a sample size of two, yeah, we're going to be way off, but with a sample size of more than 30, we're pretty much right on the money, or really, really close. So as our sample size increases, our ability to target the population proportion increases as well. So while this one's over 2/3, this one's under, this one's over, this one's under; it's not all over, it's not all under. Some are over, some are under; it's not systematically doing any one thing. We call that targeting. If I take the average, it is exactly the same thing. So what we would say is that this shows that sample
proportions, the average of our sample proportions, equals the population proportion. That means our sample proportions target our population proportion. If you're still not quite getting why this is important: we've just proven (you follow me?) that this is targeting it. On average it's giving you the same thing; it's not systematically over- or underestimating. That's what "target" means. If we have a statistic that targets a parameter, in other words something about our sample that targets something about our population, this says that the sample statistic is a good estimator for the population parameter. This is the legwork we needed. This is showing that while any one sample is not right on the money, it's going to be close enough if our sample size is big enough. While it's not exactly accurate, it's going to be a good estimator overall. So what this says is that the sample proportion targets the population proportion, or in other words, we can use a sample proportion as an estimator for the population proportion. "Targets" and "estimates" are pretty much synonymous in this case.

So this right here is a sampling distribution of the proportion. What we're going to do right now is construct another one for the mean. I'll show you, with the same data. So let's do a sampling distribution of the mean. Same idea: 1, 1; 1, 2; 1, 5; 2, 1; 2, 2; 2, 5; 5, 1; 5, 2; 5, 5. These are our samples of size n = 2. This is my probability; they're all going to be 1/9. But now instead of dealing with the proportion, I want to look at the mean.

Oh, by the way, what's our population again? "Three"? No, that's the population size. What is the population? "2/3"? That would be the population proportion. What are the things we're dealing with, folks? What are the only three numbers in the world? That's your population. The population of this class is the people who are in this class; the
population we're dealing with is one, two, and five. There are three items in our population, and the proportion of odd numbers is two-thirds, but our population is one, two, and five. You with me?

Now forget about the proportions. That was one example; this is a different example, not related to that one in any way, shape, or form. You okay with that? New example: we're now talking about the mean. What I want you to do is find the mean of your population. You're looking at me like I'm crazy. Are you confused? How do you find the mean? Add them all up and divide by the number you added. What do you get when you add them all up? You get eight. And what are you going to divide by? Three. Do you agree that your mean can be in thirds? What's the symbol for the population mean that you're going to use? This is not a confusing question; it's something you've been doing since the first day of class. What symbol is used for the mean of a population? Mu. It's not a trick question. So the mean for the population here is eight-thirds: you add them all up, you divide by the number you added. You okay that our mu is 8/3? Remember, ignore the earlier example; we're on something new, a whole new ball game. We're talking about the mean now. You find the mean, which you know how to do; that was on your very first test. The mean for the population is labeled mu, and mu equals 8/3.

What we're going to do now is find the mean for every sample. What symbol do you use to represent the mean of a sample? It's not the one you use for a population. Say that again? x-bar. We've rarely used x-bar because we've been doing populations so much, but x-bar is going to come back to us. What
we're going to do is calculate x-bar. What does x-bar stand for? Don't just say "the mean"; you've got to be specific: the sample mean. Good. What does mu stand for? The population mean. Are they different? One deals with a sample, one deals with a population. Okay. The population mean for ages in this classroom should be one thing; the mean age of these five people is going to be a different thing, right? Generally it's going to be different.

Let's find the mean for each of these samples. You just add the values and divide by the number you added. How many are we adding in every case? Two, so we're going to be dividing by two. What's the mean of one and one? It's one, sure: you add them up, you divide by two, you get one. What's the average of one and two? 1.5. What's the average of one and five? Six over two, which is 3. Are you guys with me on how we're getting these numbers? Are you sure? I've got to have you with me; if not, let me know and I can go back over this again. Do you understand where the 1, the 1.5, and the 3 are coming from? We're averaging these pairs. So the sample means in this case are 1, 1.5, 3, 1.5, 2, 3.5, 3, 3.5, and 5. I didn't make them up; I averaged those numbers: added them and divided by two in my head. Are you okay getting that far? Good.

All right, now just like we did over here (this is going to be a little different), we are going to find the average of these things, because I want you to notice something. Eight-thirds is 2.666..., sixes forever. Are any of these averages 2.666...? No. But look at them: some of these are below 2.666... and some are above. It's not systematically above, and it's not systematically below. Some are above and some are below. We're going to go ahead and average these averages. So again, this is a
different sampling distribution. Do you agree this is every possible different sample of size two? We found the mean of every possible sample of that size and put it in a table. This is a sampling distribution; it's a distribution of all the sample means. You with me? That's what it is.

We're now going to average these means; we're going to average the averages. How you average averages in this case (by the way, I'm cheating on the symbol; we'll talk about that in a while) is the sum of x-bar times P(x-bar). So we're going to go ahead and calculate x-bar times P(x-bar), or in other words, our sample mean x-bar times the probability of each sample occurring. We get 1/9. Then 1.5 times 1/9, oh my gosh, that's going to be 3/18. Then 3/9, 3/18 again, 2/9, 7/18, 3/9, 7/18, and 5/9. It might have been a little easier to leave those means as fractions, but that's okay. Do you see where the fractions are coming from? I'm just multiplying these two numbers to get those.

If I add all those up (maybe do a little conversion here and change all of these to eighteenths, if you'd really like to: this would be 2/18, this would be 6/18, and so on), I get two plus three, that's five, then eleven, fourteen, eighteen, twenty-five, thirty-one, thirty-eight, forty-eight. So I add them all up and I get 48/18. Can you reduce 48/18? What goes into both numbers? Six. That gives you 8/3. If you add up this column of numbers using a calculator (if you're not very good at fractions, which hopefully you are), add up all those numbers, reduce it, and you're going to get 8/3, or 2.666... with sixes forever. You okay with that so far? So what happened? Was each of these sample means the same as our population mean? No. But when I averaged them, yeah, it's exactly the same.
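The counting step and the two tables from the lecture can be checked with a short script. This is only a sketch, not anything shown in the lecture itself: the variable and function names are made up for illustration, and exact fractions are used so that 6/9 and 48/18 reduce the same way they do on the board.

```python
from fractions import Fraction
from itertools import product
from math import comb

# Counting step: samples of size n = 5 from N = 27 people, order ignored.
num_class_samples = comb(27, 5)

# The lecture's three-number "world" and the sample size used in both tables.
population = [1, 2, 5]
n = 2

# Population parameters: proportion of odd numbers (p) and mean (mu).
p = Fraction(sum(x % 2 == 1 for x in population), len(population))
mu = Fraction(sum(population), len(population))

# Every ordered sample of size 2, drawn with replacement (9 in total),
# each with the same probability 1/9.
samples = list(product(population, repeat=n))
prob = Fraction(1, len(samples))


def proportion_odd(sample):
    """Sample proportion (p-hat) of odd numbers in one sample."""
    return Fraction(sum(x % 2 == 1 for x in sample), len(sample))


def sample_mean(sample):
    """Sample mean (x-bar) of one sample."""
    return Fraction(sum(sample), len(sample))


# Mean of each sampling distribution: sum of statistic * probability.
mean_of_p_hats = sum(proportion_odd(s) * prob for s in samples)
mean_of_x_bars = sum(sample_mean(s) * prob for s in samples)

print("samples of 5 from 27:", num_class_samples)
for s in samples:
    print(s, "p-hat =", proportion_odd(s), "x-bar =", sample_mean(s))
print("mean of p-hats:", mean_of_p_hats, "population p:", p)
print("mean of x-bars:", mean_of_x_bars, "population mu:", mu)
```

No single row has a p-hat of 2/3 or an x-bar of 8/3, but the two averages printed at the end match the population values exactly, which is the "targeting" idea.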
What this says for us is that the sample mean targets, or in other words estimates, the population mean. So far we know two things. Here's basically the underlying concept for this section: the sample proportion will be a good estimator for the population proportion, and the sample mean will be a good estimator for the population mean. That's all this whole section says. There's one more: the sample variance will ultimately be a good estimator for the population variance. The problem is standard deviation: the sample standard deviation is actually not a great estimator for the population standard deviation. It systematically underestimates it, so we're going to have to fix that, but it's going to be close enough for us to use.

So our section about sampling distributions comes down to this: you should know what a sampling distribution is (it's this concept), and that if we average the statistics together, the average will target, or estimate, the population parameter. Hopefully you feel okay about this section. We'll wrap this up next time.
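The closing remark about variance and standard deviation can be tried on the same three-number population. The lecture does not work this case, so the following is a sketch under stated assumptions: the population variance divides by N, the sample variance divides by n - 1, and the samples are the same nine ordered pairs drawn with replacement. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import sqrt

population = [1, 2, 5]
samples = list(product(population, repeat=2))  # 9 ordered samples, with replacement


def population_variance(values):
    """Variance dividing by N (assumed convention for a population)."""
    m = Fraction(sum(values), len(values))
    return sum((x - m) ** 2 for x in values) / len(values)


def sample_variance(values):
    """Variance dividing by n - 1 (assumed convention for a sample)."""
    m = Fraction(sum(values), len(values))
    return sum((x - m) ** 2 for x in values) / (len(values) - 1)


# Population values.
sigma_squared = population_variance(population)
sigma = sqrt(sigma_squared)

# Averages over the sampling distributions (every sample has probability 1/9).
mean_of_sample_variances = sum(sample_variance(s) for s in samples) / len(samples)
mean_of_sample_std_devs = sum(sqrt(sample_variance(s)) for s in samples) / len(samples)

print("population variance:", sigma_squared)
print("mean of sample variances:", mean_of_sample_variances)
print("population standard deviation:", round(sigma, 2))
print("mean of sample standard deviations:", round(mean_of_sample_std_devs, 2))
```

Under these assumptions the sample variances average out to the population variance, 26/9, while the sample standard deviations average roughly 1.26 against a population value of roughly 1.70, which is consistent with the statement that the standard deviation is systematically underestimated.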
Info
Channel: Professor Leonard
Views: 141,753
Rating: 4.9146667 out of 5
Keywords: Professor, Leonard, Sampling Distribution, Statistics (Field Of Study), Sample, Probability Distribution, Sampling (Literature Subject)
Id: kBYt67NDToI
Length: 50min 20sec (3020 seconds)
Published: Mon Dec 12 2011