Excel Problem Solver: Are Mega Millions Lotto Drawings Biased?

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
welcome to spreadsheet geek in this video we'll take a look at the mega millions lottery drawing the jackpot as i record this video is currently estimated at 850 million dollars near record levels if you're going to play the mega millions lottery is there any advantage to playing certain numbers that's the question we'll seek to answer in this video this video was made using microsoft excel 2019 [Music] currently 45 states participate in the mega millions lottery making it one of the largest and most popular lotteries out there the mega millions lottery began in 1996 with only six states participating over the years the number of states participating in mega millions has increased dramatically to the current 45 plus the district of columbia the rules of the drawing have changed also initially there were five white balls picked from a field of 50 numbers and one gold or mega ball picked from a field of 25 the number of white balls and the number of gold balls has changed over the years as have the odds which have consistently gotten worse for the player buying a chance to my knowledge the mega millions lottery administrators have never really given any justification for why they change the number of white balls and the number of gold balls over the years but i find it curious that these odds generally correspond to the total population of all the states that were participating in the lottery during those years as participation in the lottery increased the odds decreased and the jackpots got bigger calculating the odds of winning a lottery when you have five balls chosen from a field of 70 and one ball chosen from a field of 25 is no easy task but i found a nice calculator on google if you enter 70 choose 5 and multiply by 25 the google calculator can calculate those odds at 302 575 350. those are the exact odds as presented on the mega millions website for winning the jackpot this is a video of a mega millions drawing which takes less than a minute and requires hours of preparation by the mega millions staff i'm going to freeze this at different intervals to show what's going on as we start you'll notice the balls are all displayed up here above the machine so anybody who was observing this could count that there are exactly 70 white balls and 25 gold balls these balls are released into a machine which turns and mixes the balls this is what's known as a gravity feed type of machine because gravity will pull one of the balls down into this little chute and then down into the display area where that becomes one of the chosen balls in preparation for this particular drawing and every other drawing a group of people more than one and this includes an outside auditor which is listed here on the screen very briefly they go into a secure location and they choose these two machines from a pool of many machines and they choose these white balls from a pool of multiple white ball sets as well as the gold ball set that they choose there are multiple gold ball sets so they put all of this together i would imagine they perform tasks like weighing each ball to make sure they're consistent and checking the shape and looking for damage dirt imperfections that might cause a bias in one of the balls and reportedly according to the website they test the machines repeatedly before each drawing so a lot of preparation goes into this drawing these balls are actually reportedly rubber balls so they kind of look like ping-pong balls but they are evidently made out of rubber the white ball machine is set to capture exactly five balls and you can see this little chute moves out at seemingly regular intervals until exactly five balls are shown in similar fashion the gold ball machine chooses one ball and displays it so the question is is there bias in the balls are some balls more likely to fall down than others when you look at the balls there are obvious differences between them some of these balls are marked with double digit numbers and some are just single digit numbers does that extra amount of ink or whatever is used to mark those balls causing differences in friction or weight and making some balls more likely to fall i don't know what we want to do in this video is take a look at the results and see if there's any unusual bias towards certain numbers to do this we'll first look at the standard deviation of number drawings and then we'll look at a chi-squared test i'm going to focus on mega millions drawings that took place in the october 28 2017 to present period where five balls are drawn from a field of 70 and one gold ball from a field of 25 since that's the most relevant period there are many state lottery websites where you can get lottery drawings of the past and download the data but i'm going to use this website i think this is a probably one of the better ones it's called usamega.com and you can get every drawing in history and the statistics on the balls these are statistics based on the 172 drawings from september 6 96 to january 12 1999 that very first period of mega millions drawings so i've downloaded my data for the pick five of the balls numbered one through seventy this is from the most recent period of 2017 to present and you can see that the ball numbered 31 was the most popular ball during this period it was drawn 34 times and if we go down to the bottom you can see that the 21 ball was chosen only 14 times i'm gonna do a quick little sum of this times drawn that should be evenly divisible by five so let's just check that out and we have 336 drawings and that does correspond exactly with the website which tells me there were 336 drawings during this period so the first thing i'm going to do is look at the standard deviation and see if we have anything outside of two standard deviations that would kind of peak my interest about bias so the first thing i'm going to do is hit ctrl a and let's sort this list by the ball numbers so now my balls are numbered 1 through 70 and i'm going to take this side of the data and insert a bar chart and i'm going to rename that something appropriate and if we stretch out this chart a little you can kind of get a feel for what my unpopular numbers are here's 21 and then at the high end we have 31. one more thing i'd like to do with this data is get a summary statistic in excel from it you'll need to go over to here and hit options and make sure that your data analysis tool pack is turned on mine is turned on here if you don't see this here this is an excel add-in so go to your excel add-ins and check this box and hit okay it may take a few seconds to load so what this is going to enable me to do is click here under data analysis i want a set of descriptive statistics for my data it's going to ask me what's my input range and i will select the number of drawings and it's asking me for an output range in which case i am going to select this location right here on my worksheet and we'll just get summary statistics so these are my summary statistics you can see that i have a standard deviation of about 5 and i have a mean of about 24. that mean is in fact exactly 24 because it has to be we have sixteen hundred and eighty drawings of numbers in thirty three hundred and thirty six drawings so if you take sixteen hundred and eighty and divide by seventy you come up with exactly 24. if each ball had been drawn the exact same number of times that number of times would be exactly 24 so that is correct now i like to look to see if i take my mean and i add two standard deviations to it that would be my high end and if i take my mean and i subtract two standard deviations that's kind of my low end do i have anything that falls outside that range do i have anything above 34 well the highest one here is 34 exactly so technically that would fall within two standard deviations and the 13.99 we've got a 14 there and i believe 21 is my lowest number and that is correct so technically that is outside two standard deviations to the left but just barely these are both observations which are right at the two standard deviation line the next test i'd like to perform on this data is a chi-square test sometimes called pearson's chi-square test this is a test intended to determine how likely it is that an observed distribution is due to chance the chi-squared test is also called a goodness-of-fit statistic because it measures how well an observed distribution of data fits within the distribution that is expected if the variables are independent and remember our expected distribution of these picks from the numbers 1 to 70 is 24 times each one general rule with chi-squared tests is that all of these drawings should be at least five since our minimum is 14 it looks like we're okay in that respect our null hypothesis for our chi-square test is that the drawings are unbiased hence each numbered ball is expected to be drawn the same number of times a chi-square test takes into account the number of times numbers are drawn if you were to flip a coin 100 times and get 60 heads you might not be suspicious that that coin is biased for heads but if you flipped it 10 000 times and got 6 000 heads even though that's the same percentage you might be a little more concerned about that coin being biased to complete my chi-square test i'm going to complete this little table here so i is my numbered balls that's just 1 through 70 and the next column over is my obs observations o sub i and that's these times drawn over here so these numbers are just transferred straight over from what we had already now my expected observations as we said is 24 so i'm just going to fill that down this column equals o sub i minus e sub i we'll fill that down and then we want to take that and square it in this column and finally we want to take that result and divide by the expected number of observations down at the bottom i sum up these far right column numbers and this leads us to our chi-square statistic which in this case is 72 so how do we interpret this chi-square statistic of 72 does this support or not support our null hypothesis well what you have to do is go to this table of values and these are upper tail critical values of chi-square distribution with v degrees of freedom the value for v for us will be 70-1 or 69 since we have 70 balls numbered 1-70 you take one less than that so we're interested in this 69 row here let me transfer those values over so since our chi-square statistic of 72 does not exceed this number here we can say with a 90 significance level that our null hypothesis holds if we want to set the bar higher at 99 percent we would have to have a chi-square statistic that exceeds 99 and it gets more difficult to satisfy that so from this little chi-square analysis we can say that it does appear that these number drawings are truly random and you can see what the big contributors were remember 31 and 21 these are the big contributors to our chi-square statistic they make up a lot of it we would need more of those big contributors those outliers to generate a higher chi-square statistic to get over these hurdles and throw out our null hypothesis so what about the mega number that single drawing from a pool of 25 well i repeated the whole exercise offline for the mega balls 1 through 25 and looked up the times they've been drawn here it is presented on the bar graph you can see our low number most unpopular was eight it was only chosen seven times and the most popular one in this case was 22 you would expect each of these numbers to be drawn 13.44 times so we do have some outliers that exceed two standard deviations on this one but there's only two of them how did the chi-square analysis come out on this one well the numbers are a little different we have a chi-square statistic of 16.08 and these values are different as well but it's just as hard getting over the hurdle even the 90 percent significance level would require a 33.196 and we're nowhere near that so this chi-square analysis for the mega ball number also supports our null hypothesis that mega millions mega ball number drawings are truly random when you're talking about multiple machines and multiple ball sets in play and going to such great lengths to ensure the randomness of these numbers i'm not surprised at the result and remember the gambler's fallacy this is a belief that if a truly random event has occurred more frequently than normal during the past it is less likely to occur in the future so if you think you should pick the mega ball number 8 because it hasn't occurred much in the past you are sadly mistaken past events have no impact on future events when you're talking about truly random events based on this analysis we have no reason to believe that mega millions drawings are anything but random i hope you've enjoyed the content of this video and found it interesting and informative please consider hitting my subscribe button i'll be releasing one new video per week for the foreseeable future if you have a comment i'd really like to hear your comments on different problems and excel issues you'd like to have solved thanks for visiting spreadsheet geek so [Music] you
Info
Channel: spreadsheet geek
Views: 1,819
Rating: 4.8139534 out of 5
Keywords: mega millions, excel solver, excel basics for beginners, Excel tips, excel 2016 tutorial, excel tutorial, excel for analysts, advanced excel tutorial, advanced excel training, excel tutorial for beginners, spreadsheet tutorial, microsoft excel, microsoft excel (software), Megamillions Lotto, chi squared test on excel, pearson chi square test, pearson chi square test interpretation, standard deviation statistics, standard deviation excel, standard deviation explained
Id: wolLiawcsv4
Channel Id: undefined
Length: 19min 23sec (1163 seconds)
Published: Sun Jan 17 2021
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.