Chi Squared Test of Independence

Video Statistics and Information

Captions
This is Dr. Mears, and today we're going to learn about the chi-squared test of independence. This is the example that we did in class together when I taught live. Dr. Mears wanted to investigate if gender and preferred color of shirt were independent. This means that she wanted to find out if a person's gender influences their color choice. She conducted a survey and organized her data in the following table. So this table is our observed: this is what we observed as we went through and compiled the data from our survey, so this is called the observed table.

For doing a chi-square test of independence, you received this worksheet, and what I did is I filled it out so that the video would go a little bit faster. So we are going to go through the steps just like I did in class; however, some of this is filled out, so you may have to pause if you weren't in class to get the notes, or you may want to pause anyway to make sure that you're understanding correctly and asking any questions.

Okay, here we go. Step one: we have to write the null and alternative hypotheses. So we look to see: here's the H0, so this is our null hypothesis here, and we fill in our two categories: gender and preferred shirt color are independent. The null hypothesis, or H sub 0, is always "are independent", meaning that they have no dependency on each other; they don't influence each other at all. Now our alternative to that is our Ha, and for our H sub a, again, you fill in the same categories the same way: gender and preferred shirt color are not independent. You could write "dependent" here; I like to keep the wording just the same, so one is "are independent" and the other one is "are not independent", which means that they have some type of influence on each other. So let's go through and perform a chi-square test to see if we are going to reject or not reject our null hypothesis.

Here we go, step two: we have to calculate the chi-square test statistic. So what we have to go back to is the observed, or the
original table they gave you in the example, and we have to add on the row, column, and overall totals. So this is the table that they gave you here with the numbers; we are adding on our total column, total row, and overall total. And in class I said, please label: so these are our row totals, down here are our column totals, and this is our overall total. So in order to get your row total, you're adding 48, 12, 33, and 57, and that should give you 150, which means there are 150 males that were surveyed. For females, 34, 46, 42, and 26: that equals 148. So there were 150 males surveyed and 148 females surveyed, for a total of 298 people. For our column totals here, 48 plus 34 is 82; that means that 82 people preferred black shirts. Here, 58 people preferred white, 75 preferred red, and 83 preferred blue.

So now that we have our row totals and column totals, we can go to Part B. Part B says: make an expected value table from the totals. For each entry, do the following: for each one of these we have to get the row total, multiply it by the column total, and divide it by the overall total. So let's say for male, we would look: 150; prefers a black shirt, 82. So for male here, this is the row total for male, and for black shirt, this is the column total. So I have filled it in; I'm going to go through quickly with you. Hopefully you can see the numbers okay, because I did write relatively small. So again, for males who prefer black: 150, which was here, times 82, divided by 298. I did go to three significant figures, so correctly rounded to three significant figures it is 41.3. I'm going to go to males who preferred a white shirt: so again, male, 150; white shirt, 58; 150 times 58 divided by 298, correctly rounded to three significant figures, is 29.2. I'm going to do it for red and then blue. Notice the 150 is the row; 75 is the column for red, 83 is the column for blue, and I'm still dividing by 298. I'm going to do the same thing for female, but what changes here is my row total, which is now 148. Female who prefers black:
82, so 148 times 82 divided by 298. Then on to females who prefer white: 148 times 58 divided by 298. Red: 148 times 75 divided by 298. And finally blue: 148 times 83 divided by 298. Each one of these was calculated out to three significant figures.

Part C: check to make sure each number in the expected value table is 5 or higher. So we have to look here: 41.3, 29.2, 37.8, 41.8. All of these numbers here have to be 5 or greater; if they are not in your IA, please come and see me. So "All expected values are greater than 5": this is the sentence that you're going to see. Then you're going to state, "May proceed with the chi-square test." So you have to physically write down that they are greater than 5; if they're not, come and talk to me.

Okay, on to Part D. Now Part D is calculating the actual chi-squared value, and so I'm actually going to put this up here. In order to do that you have to use this formula. This formula is the chi-squared formula: chi-squared equals the sum of the observed minus expected, quantity squared, divided by the expected value. So what we're going to do is we are going to be going to the observed. So let's take male who prefers a black shirt: we have 48, so that's the observed. We're going to subtract the expected number; this is our expected number, so it's going to be 48 minus 41.3, and that's what I have here: 48 minus 41.3, square it, divided by its expected, which is 41.3. So these two numbers here should be the same. So this was the observed minus the expected, squared, and don't forget, divided by the expected: O minus E, squared, divided by E. Then you go on to the next one. The next one was going to be male, preferred white shirt, so that's going to be 12; come down here, male, preferred white shirt, 29.2. So that's 12 minus 29.2, don't forget to square it, divided by 29.2. Again, these numbers should be the same
here; that's my observed value. The next one is male who preferred a red shirt: observed is 33; male who preferred a red shirt, the expected is 37.8. 33 minus 37.8, squared, divided by 37.8. Next, blue shirt: men who preferred blue, 57; men who preferred blue, expected 41.8. 57 minus 41.8, squared, divided by 41.8. We're not finished, because that's just the males; we have to move on to the females. So jumping down here, it's 34 females who preferred black; that is my observed. Females who preferred black, 40.7; that's my expected. So we see 34 minus 40.7, squared, divided by 40.7, the same. I did not forget to square. That is my observed. Next, 46: so we have females, 46 preferred white; down here, females expected, 28.8. So 46 minus 28.8, squared, divided by 28.8. I just jumped down over here; I ran out of room. I'm going to finish up the last two. So the last two: red, which is 42 for female, minus 37.2, squared, divided by 37.2; and for blue, 26 minus 41.2, divided by 41.2 (I'm sorry, not 37.2; you can clearly see that, though). So this one's 41.2; don't forget to square it. So I like writing all these out first, as you can see, and then making sure I'm double-checking with both of these. Please remember, when you're doing this you're not using the totals anymore; the totals were just for this expected value chart. We're just using the inside numbers now to compute our chi-square test statistic.

Okay, so after we're done filling all the numbers in, take out your calculators, and you can actually put it in. I like to do one at a time; I get very worried with all these decimals. And notice I did go to four decimal places for each one: I wanted to have a much more precise number for chi-squared, so I didn't go to three significant figures; I just kept it with four places after the decimal, which is fine. I
didn't want a roundoff error. So what I did is I actually just did one of these and calculated it out, then I did this one and calculated it out, then I did this one and calculated it out, and so on and so forth down the line. After I did that I double-checked my work, of course, and then I added them all up and received 34.9572. So now, if you were a little off on these decimals, it's probably because you may not have gone to four decimal places, so you may have had a little roundoff error. So you try to get as close to 34.9572 as you can. It's okay if these decimals are off a little bit; if they're off a lot, then come and ask me and we'll see what mistake you made in your work. So that's the chi-squared value. We're going to put a little box around it, because we're going to come back to that.

Step three: we need degrees of freedom. Degrees of freedom is number of rows minus 1, and number of columns minus 1. So the number of rows: the rows are going across, and so we see the rows, there's male and female, which is 2. You do not include the totals, just the original values. So actually I can even go back to the observed here: number of rows, 1, 2 here; and the number of columns, columns go down, so 1, 2, 3, 4. So number of rows was 2, minus 1; number of columns, 4, minus 1. 2 minus 1 is 1; this is multiplication; 4 minus 1 is 3. So 1 times 3: the degrees of freedom is 3, and I did circle that down.

Okay, step number four: we have to look up some critical values. Now for your IA you most likely are going to be using the 5% level; that is the default in statistics. So what we have to do is look up these critical values using our degrees of freedom. To do this, look up your degrees of freedom and the significance level you wish to use. I want you to try all three so that we can get these numbers. So now, how do we get 11.345? You take out your table that I copied for you and you go to the degrees of freedom. Our degrees of
freedom was 3, so we go right here. So you go to 3; here we are, 3 degrees of freedom, which is right here. I want to find first an alpha of 0.010, which is at the 1% significance level, so I'm going to look for 0.010. We see over here chi-squared, and then, little and tiny, it says 0.010; here, that's 11.345. So we went to the degrees of freedom of 3, all the way over, and stopped when we saw 0.010, and that's the number that I got on the paper here: 11.345.

Now let's try a significance level of 0.05. So at the 5% level, that's an alpha of 0.050. What we're going to do is still stay at degrees of freedom of 3. Now we're going to look for 0.050, and that gives us 7.815, and so that's what I got over here: 7.815.

Now let's go to the 10% level, which is an alpha of 0.100. So for 10% we're going to be looking for 0.100; notice they also go to three significant figures. Same degrees of freedom: you use it throughout your problem. That doesn't change for this problem; it may be different for another one, but once you're set on a degrees of freedom you stick with it for that problem. And we're going to look: here is 0.100, 6.251, and that's what we got over here: 6.251.

So now we have to compare. We have the chi-squared, which is going to be 34.9572, and that's the number that we received up here: chi-squared, 34.9572. All I did was bring it down. The chi-squared value has to go first; this has to go first. Then the critical value: I said you're most likely going to be going with that 5% level, so the critical value is 7.815. You could choose these others, but then you're choosing a different degree of confidence, and so I
talked about confidence levels, and if you want to discuss more we can; this video is just how to carry out this chi-square test. So chi-squared goes first, the critical value you chose goes second, and we compare these. This is greater than, so we're going to be looking at greater than. So we look down on the bottom: which one is greater than? This one says less than, this one says greater than. This gave us a greater than, so we're going to be using this one. As you can see, chi-squared, which is 34.9572, is greater than our critical value of 7.815; therefore we're going to reject H0. So we're going to kick that null out. Therefore we have evidence that gender and preferred shirt color are not independent. We can further say that there may be an influence: we have evidence that gender and preferred shirt color may influence each other. But this test just tells us: are they independent or are they not independent? And we found, because our chi-squared was greater than our critical value, that we rejected our null hypothesis, or H0, and therefore gender and preferred shirt color are not independent.

So this is how you carry out a chi-squared test by hand. We are going to be learning about how to use the calculator, which is going to be step five here, as you can see on your packet; however, I am not doing that for this lesson. That is a different lesson on a different day with a different video. So this has been the chi-square test. Hopefully it helped. Thank you.
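The by-hand steps in the captions above can be sketched in a few lines of Python. This is only an illustration, not part of the worksheet: the observed counts are the ones read out in the video, the critical values are the three looked up for 3 degrees of freedom, and the names `observed`, `CRITICAL_DF3`, and `chi_square_independence` are made up for this sketch.

```python
# Observed counts as read out in the video.
# Rows: male, female. Columns: black, white, red, blue.
observed = [
    [48, 12, 33, 57],
    [34, 46, 42, 26],
]

# Critical values for 3 degrees of freedom, as looked up in the video's table.
CRITICAL_DF3 = {0.01: 11.345, 0.05: 7.815, 0.10: 6.251}


def chi_square_independence(table):
    """Return (expected table, chi-squared statistic, degrees of freedom)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    overall = sum(row_totals)

    # Expected value = row total * column total / overall total.
    expected = [[r * c / overall for c in col_totals] for r in row_totals]

    # Chi-squared = sum of (observed - expected)^2 / expected over the inside cells.
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

    # Degrees of freedom = (rows - 1) * (columns - 1), totals not counted.
    df = (len(table) - 1) * (len(table[0]) - 1)
    return expected, chi2, df


expected, chi2, df = chi_square_independence(observed)

# Check that every expected value is at least 5 before proceeding.
all_at_least_5 = all(e >= 5 for row in expected for e in row)

# Compare against the 5% critical value; reject H0 if chi-squared is greater.
reject_null = chi2 > CRITICAL_DF3[0.05]

print([[round(e, 1) for e in row] for row in expected])
print(round(chi2, 2), df, all_at_least_5, reject_null)
```

Because this sketch keeps the expected values unrounded, its statistic comes out near 34.97 rather than the 34.9572 obtained in the video, where the expected values are rounded to three significant figures and each term is kept to four decimal places. The conclusion at the 5% level is the same either way.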
Info
Channel: Stephanie Meerse
Views: 40,022
Rating: 4.8791733 out of 5
Id: 5Nrq_jqu9Ao
Length: 16min 36sec (996 seconds)
Published: Mon Nov 27 2017