Statistician Answers Stats Questions From Twitter | Tech Support | WIRED

Video Statistics and Information

Captions
Hello, I'm Jeffrey Rosenthal. I'm a professor of statistics at the University of Toronto, and this is Stat Support. [Music]

Question from king d weeb: Why do statisticians get so worked up over probability? Every event is just 50-50. It either happens or it doesn't.

This is something I've heard before, this idea that, well, if it can either happen or not, it must be 50-50. Sometimes that's referred to by philosophers as the principle of indifference, meaning that anything that could happen, they must all have the same probability. The thing is, it's just not true. When I go home today from the studio, I might get killed by a bolt of lightning, or I might not get killed by a bolt of lightning. I'm pretty sure there's not a 50 percent chance I'm going to get killed by a bolt of lightning.

Okay, next we have a question from what the fuss, who says: Why is statistics important in life?

Really, we're awash in all kinds of different data, so anything from, you know, the spread of disease, or crime statistics, or studies of a medical treatment, or financial data, or public opinion polls. There's so many facts and figures and statistics out there. The science of statistics is a way to try to sort through it. So if you don't have any statistical knowledge or understanding or perspective, then you're likely to say, well, this must be true because my friend said it, or this must be true because I heard it on the news, or I just kind of think it must be true. But if you have statistics, you can try to analyze all the facts and figures that are out there and try to see what are the real trends, what's really happening, versus what things really aren't the way people think they are.

Next we have a question from lawrenceitv, says: Question for statisticians: why did the polls get it so wrong? Explanations please.

Yeah, so public opinion polling, especially when it's predicting elections, is a very high-profile thing but also a hard thing to do, and usually people notice the mistakes more than the corrections. So a lot of public opinion polling for
elections has actually been quite accurate, and it's predicted things quite well. But there have been some high-profile misses, for example the US presidential elections of 2016 and 2020. Now, even in those cases, typically the polls' prediction compared to the actual results was usually only off by about four or five percent, which isn't such a huge amount considering how hard it is to figure out what's going to happen. But it's still a big enough error that if the election's close, it can make a big difference. So why is that? Well, election polls, of course, they don't ask everybody how they're going to vote. They just ask a sample, usually a few thousand people, and then try to figure out what maybe a hundred million people are going to do. So that is a challenge. The good news is, if the polling is done randomly, that is, we're equally likely to pick every person with the same probability, then we have good statistics to allow us to figure out how accurate we're going to be, what will be the so-called margin of error, you know, how close we'll usually be to the true answer. And actually that works pretty well. But what makes it especially hard for the pollsters is that it's hard to get a random sample, and the main reason is because most people don't want to talk to pollsters. Polling companies don't necessarily like to talk about it, but their response rates are usually less than 10 percent, and that can lead to a lot of biases, because maybe people who support a certain candidate are a little bit more likely to agree to talk to the pollsters than people who support another candidate. And any little response bias like that can have a huge impact on the results.

Question from come on matt think: What are some common statistical errors, and how can we learn to spot them and, if possible, correct them in others and our own work?

One of the biggest things is people don't think about what I like to call the "out of how many" principle, and that's this idea that when something happens that's striking, people will
compute the probability of it happening in that exact way to that exact person, but not look at the chance that it will happen in some way to somebody. There was a woman in England who had two sons who each died in infancy. There is something, as you probably know, called SIDS, or sudden infant death syndrome. So maybe just two times she got really, really unlucky and her babies stopped breathing, or maybe she was a murderer and she'd actually suffocated them. And she was actually arrested and charged, and at her trial they said, oh, it's so unlikely that there'd be two SIDS cases in the same family that we can rule that out; she must have actually tried to kill them. And that's an interesting example where, if you just look at the probability, given two kids in one family, what's the chance they're both going to die of SIDS? Of course, it is very unlikely. But then if you say, out of all the millions of families in the United Kingdom or in the whole world, what's the chance that somewhere there's a family where two kids both died of SIDS? Extremely likely. And it seems like that was the case with her. There was actually no other evidence that she had actually tried to kill these kids. She was just extremely unlucky, and yet she was convicted, she was jailed, she spent several years in jail before there was enough of an outcry, and eventually, on the second appeal, the case was overturned.

Question from Josh Lev, says: What's more likely than winning the lottery?

The short answer is everything. That is to say, if you're talking about winning a lottery jackpot for one of the big lotteries like Mega Millions or Powerball, then the chance of winning that jackpot with a single ticket is one chance in a couple of hundred million, depending on which lottery. So just incredibly unlikely. So compared to that, almost anything you can think of, being killed by a bolt of lightning, or the next person you meet will one day be the president of the United States, or any crazy thing you can come up with, we
can estimate the odds for all of them, and they're all more likely than the chance you're going to win the Powerball lottery. And in fact, one that I like to use as an example is, if you drive to the store to buy your lottery ticket, you're way more likely to be killed in a car crash on your way to the store than you are to win the jackpot.

Next we have a question from s molly mall: I'm just patiently waiting for people to realize that all statistics are skewed because the data is skewed in so many ways that I can't even list them all.

So not a big fan of statistics, maybe. But that's true, that's a good point, that all data is going to have some things that are wrong with it. Maybe it was biased, maybe it wasn't measured correctly, maybe it only shows part of the story. But I don't think that means we should just forget about it and just forget about statistics and data. I think what it means is we have to think carefully when we get data. We have to say, how is this data collected? Is it an accurate reflection of the truth? In what ways is it going to be biased or misleading? And then we can still draw inferences from it, but it's true that we have to be careful.

We have a question from John Friedberg, says: About to play what must be the absolute worst casino game in terms of odds. Any guesses?

Well, it's an interesting question. There's different casinos with different games, but one of the games which, to my surprise, is one of the most popular and also has one of the worst odds against you is the video lottery terminal. So people love them, but they usually have at least a five percent and maybe 10 or even 15 percent house edge, so they're really not the best game. Now, there are some casino games which have odds which are much better for the player. So for example, of the pure chance games, the game craps, where you repeatedly roll a pair of dice kind of like these, you have a 49.2929 percent chance of winning.

Next question from shava kadzi: Are murder rates skyrocketing, or the media doesn't
have much to report, so they are focusing more on that?

Yeah, it's a good question. So murder rates have generally been coming down a little bit in the last couple of decades, but in the last few years there's been a little bit of an uptick, so they're now a little bit higher than they were a few years ago, but they're still quite a bit lower than they were a decade or two ago. Also, I've noticed, for example, politicians and police spokespeople and so on, they all will at times say, oh, crime rates are way up, for their own reasons. They have reasons for wanting that to be said, even though, you know, maybe it's not actually true. So it's just one more reason that if you want to know what's happening with something like, you know, rates of crime, well, don't listen to what a few people are saying. Look at the actual statistics, and then you can see the truth.

Next we have a question from brentaclan, says: How does probability work in the roulettes?

So it's a good question. Roulettes are fairly simple. So the standard American roulette wheel has 38 of those little wedge slots, and two of them are green, there's the zero and the double zero, and then the others are divided into 18 red and 18 black. The person at the casino spins the wheel, and presumably it's equally likely to come up any of those 38 different wedges. So what it means is, if you bet on, for example, red, well, 18 out of the 38 wedges are red, so you have an 18 out of 38 chance of getting red, which is a little bit less than 50 percent. And that's why, if you bet on red, there's an even-money payout, but on average you're going to lose a little bit more money than you win. You can also sometimes bet on different things, like all the even numbers or something like that, but whichever bet you do, it works out to the same thing: there's a slight edge in favor of the casino. And that's why, if you play roulette over a long period of time, it's going to be more and more sure that you're going to lose more money than you win.

A question from six latin six lover six: Who
makes betting odds? Is it an algorithm?

So it's a really interesting problem for the bookies, or the people who are making these odds. Now, the goal is pretty easy to understand, because if you're a bookie, what you want is pretty much to have the same amount of betting on both sides, so that in the end you don't really care if the horse wins or not, or you don't really care if the team wins or not, because either way you're going to make money, because you're going to get your cut. Whereas if everybody bet on one side and then they all won, then you could lose a lot of money. But on the other hand, how they do that is kind of a challenge, and usually they're updating their odds as they go. And if they see, oh, everybody's betting on this one team, we better change the odds so that the next bettors are more likely to bet on the other side. And I'm not a bookie, but my impression is that in the old days it used to be done just kind of by their judgment, or, you know, experienced people looking things over and tweaking things, whereas now there's so much online gambling that a lot of it is automated, and they have algorithms, which I think are not simple, based on how everybody's betting and trying to adjust things. But the goal is pretty easy to understand: trying to balance out those bets.

Question from xenodotus: What is a stochastic process, really?

Well, I'm glad you asked. So stochastic is just another word for random, so it means random processes, or things that proceed randomly in time. And the simplest example is actually one I sometimes like to illustrate with my students using a stuffed frog, so I'll do that here. And we imagine we have a frog which every second randomly decides either to move one step this way or to move one step this way. And once it does, then the next second it again decides randomly to move one step this way or one step this way. And yet it's actually really interesting for mathematicians to study this. What's the chance that the frog will eventually return to where it
started? Turns out it's 100 percent. It's certain. It might take a really long time, but eventually it's going to return to where it started. And in fact, eventually it's going to be a million steps that way, and eventually it's going to be a billion steps that way. It's going to go to every single place eventually, if you wait long enough, with probability one. We can prove that.

Next, a question from anna cell x, says: What does it mean to be statistically significant?

So statistically significant is saying probably it wasn't just chance, that this is enough of an effect that we can pretty much, you can never do it for sure, but you can pretty much say it's probably not due to chance alone. Probably this actually shows something real. There was really a difference, or there was really an increase, or something really happened. It wasn't just the random luck. So the basic idea is pretty simple. It sometimes gets lost in the details, but when you notice something that happens, you know, maybe this classroom did better on the test than this other classroom, then as statisticians the fundamental question you're always asking is, does that mean something real, like, oh, maybe the teaching was better in this class, or maybe people in that class are, you know, smarter, or was it just random luck? So you'd never expect any two results to be exactly the same. There's always going to be some differences.

Okay, next question from John Elworthy: Can someone please help with this? What are the odds of having three generations of family members being born on the same day? First was born on January 10, 1943, the second same day 1994, and the third same day in 2022.
It's actually a good example of the sort of question where there's different ways of looking at the probability. So if you just say there's three people, what are the chances they'll all have been born on the same day, well, that's pretty straightforward. So you can think, well, the first one could be born on any day, it doesn't really matter. Then the second one has roughly one chance in 365 of being born on that same day, and then the third one has roughly one chance in 365 of being born again on that same day. So it's one chance in 365 times 365, which is a little less than one chance in 100,000, I think. So it's quite unlikely. One way I like to look at these kinds of questions is, this is sort of out of how many different ways that this could have happened. So even in this one family, probably there's a lot of other people in each of those generations, and if any three of them had matched up their birthdays, then this same tweet could have been written. So right away the chance is a lot bigger, because there's lots of different combinations which all could have led to the same conclusion. It's not incredible that it happens, but it's still pretty cool when it does happen to you.

A question from adjayo cie, says: How best can a statistician explain p-value to a non-statistician?

Yeah, so that's a good question. The basic idea of a p-value is the idea of, what is the probability that the thing you just observed would have happened just by pure chance if there was no true effect? If we look at, let's say, you know, we have some people with a disease and we give them a new treatment, and then a certain number of them get better, do you say, oh well, that means the new treatment really helped? Well, no, because some of them would have gotten better even without this new treatment. Maybe more of them got better than you'd expect on average from the new treatment, yeah, but how much more? And the p-value question would be, what's the probability, if we hadn't given any treatment, that that same number or more of the people
would still have gotten better? And if that p-value is pretty high, you know, maybe there was a 40 percent chance that they would have gotten better even without the treatment, we haven't really proven anything. And the typical standard is that if the p-value is less than five percent, or less than one chance in 20, then we say, okay, it's pretty unlikely that they all would have gotten better if it hadn't been for this new treatment, so this provides some evidence that the new treatment is helping. But if the p-value is larger, it doesn't.

Okay, so next, a question from king ambuso, says: Statistically, what are the chances?

And right, this is a display of draw results, and I believe this was from the South Africa Powerball lottery back in December of 2020. And what happened was a little surprising. So of the main numbers, there were five numbers chosen in a row, five, six, seven, eight, nine, and then the bonus Powerball number chosen was a 10. So we had six numbers all in a row for the draw. Seemed very surprising. So you could say, what are the chances of that happening? Well, the rules of the South Africa Powerball then were: you choose five numbers between 1 and 50, and then a bonus number between 1 and 20.
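The counting argument for this draw can be checked with a short sketch in Python. It uses only the rules stated above (five main numbers from 1 to 50, one bonus number from 1 to 20) and assumes, as an illustration, that the bonus ball comes from its own separate pool and that the order of the main numbers does not matter. The function names are hypothetical.

```python
from math import comb

# Rules as stated in the captions: 5 main numbers from 1..50, 1 bonus from 1..20.
# Assumption: the bonus is drawn from a separate pool, so it may repeat a main number.
MAIN_MAX, MAIN_PICKS, BONUS_MAX = 50, 5, 20

def total_draws() -> int:
    """Number of distinct draws: unordered main numbers times bonus choices."""
    return comb(MAIN_MAX, MAIN_PICKS) * BONUS_MAX

def consecutive_draws() -> int:
    """Draws where the main numbers are s..s+4 and the bonus is s+5."""
    count = 0
    for start in range(1, MAIN_MAX - MAIN_PICKS + 2):  # start = 1..46
        bonus = start + MAIN_PICKS
        if bonus <= BONUS_MAX:  # the bonus must exist in its 1..20 pool
            count += 1
    return count

total = total_draws()
runs = consecutive_draws()
print(f"{runs} qualifying draws out of {total:,}")
print(f"about 1 chance in {total / runs:,.0f}")
```

Under these assumptions the sketch finds 15 qualifying draws out of 42,375,200, or roughly one chance in 2.8 million.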
So you can say, how many different ways could you get them all in a row like that? Well, the first five numbers would have to be five numbers in a row starting with something from one, two, three, up to 15, really, so that's only 15 ways, and then the Powerball number would have to be the next one. So there's a very small number. And then when you divide that by the total number of different ways you could have chosen those five balls plus the one bonus thing, there's many more of those. So when you divide it, you get that there's a little less than one chance in two million that such a sequence like that would have come up.

Question from Chris Masterson: Is it statistically less likely to be in a plane crash if you've already been in one?

Well, no, and of course the answer is no. If you think about it, how could it be? You know, how could this new plane know, wait a minute, there's somebody on here who was on another crash, so I better not crash this time? That's just not the way science works. It's not the way airplanes work. It's not the way pilots work. But a lot of people will think that, and the reason people think that is because it's very unlikely any one person is going to be on two different planes that crash, right? That's really bad luck. But once you've already been on one, that was very unlucky, but now it doesn't have any effect on the probability of the next plane. They're what we call statistically independent events, so neither one affects the probability of the other.

So, a question from a tetraform, says: Hey, what is the most statistically improbable thing to happen to you?

Well, when I was in my early teens, my family went on a trip to Disney World, Florida, and in the middle of it all we looked up and we saw my father's cousin Phil. And he lived in Connecticut at the time, and we lived in Toronto, Canada, and we had no idea he was going to be there. He said, you know, what are the odds that out of all of the hundreds of millions of people in the United States and all the people that visit Disney
World that my dad's cousin would be there? It's a good example that, on the one hand, if you just say, what's the chance that one guy would be my dad's cousin Phil, it's incredibly unlikely. But as with a lot of things, if you take the bigger picture, you can say, well, my dad's cousin Phil isn't the only person we would have been so surprised to see. What about my dad's other cousins, or my mom's cousins, or my cousins, or my piano teacher, or my friend from school? There's probably a few hundred people that we would have been really surprised to see. And then you say, well, we were at Disneyland for a couple of days, and we went on lots of different rides and so on, and we probably saw thousands of people, and just one of them was my dad's cousin Phil. The other ones were other people. So it's actually not so unlikely, and I end up computing there's about one chance in 200 or so, about half of one percent, that if you go on a trip to Disney World and spend a couple of days there on all the rides, that you run into somebody that you know. So it's not so incredible, even though it sure was a surprise at the time.

Okay, so I think that's all the questions for today. I hope you learned something, and I hope I'll see you again.
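Two of the simpler figures quoted in the captions, the chance of red on an American roulette wheel and the chance that three specific people share a birthday, can be reproduced with a few lines of Python. This is only a sketch: it assumes 365 equally likely birthdays (ignoring leap years and seasonal birth patterns) and a fair wheel, and the variable names are illustrative.

```python
from fractions import Fraction

# American roulette as described in the captions: 38 slots, 18 red, 18 black, 2 green.
SLOTS, RED = 38, 18

p_red = Fraction(RED, SLOTS)
# Even-money bet: win 1 unit with probability p_red, lose 1 unit otherwise.
expected_per_unit = p_red * 1 + (1 - p_red) * (-1)

# Three specific people sharing a birthday: the first is free, the next two must match.
DAYS = 365  # assumption: ignores leap years and uneven birth rates
p_three_match = Fraction(1, DAYS) ** 2

print(f"P(red) = {float(p_red):.4f}")
print(f"expected result per unit bet on red = {float(expected_per_unit):.4f}")
print(f"three matching birthdays: 1 chance in {p_three_match.denominator:,}")
```

The expected result per unit bet works out to -1/19, a house edge of about 5.3 percent, and the birthday figure is one chance in 133,225, in line with "a little less than one chance in 100,000".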
Info
Channel: WIRED
Views: 2,218,994
Keywords: innovation, jeffrey, jeffrey rosenthal, jeffrey rosenthal stats, ott tech support, rosenthal, science & technology, statistician tech support, statistician wired, statistics, statistics explained, statistics professor, statistics questions, statistics wired, statitician, stats, stats professor, tech support, university of toronto, wired, wired statistics, wired tech support
Id: QW3KRaz4aI4
Length: 16min 50sec (1010 seconds)
Published: Mon Feb 21 2022