How Not to Fall for Bad Statistics - with Jennifer Rogers

[MUSIC PLAYING] They just weren't very good. Anyway, here we go. So "Living is a Risky Business," I'm going to be talking today about risk. We are bombarded every single day with headlines full of all of these things that we should and shouldn't be doing in order to live a longer life. You know, dementia, sitting too long may increase a middle person's-- middle-aged person's risk. And sometimes these even have numbers on them. So an everyday painkiller doubles the risk of heart attacks and strokes. A child's risk of brain cancer triples after just two CT scans. And we're supposed to use these headlines to inform our day-to-day lives and to help us make decisions as to how we should be living our lives. But how do we actually make sense of these numbers? So throughout this talk, I'm going to give you a little bit of a tool box as to all of the things that you should be thinking about, all of the questions that you should be asking yourselves, when you see these sorts of headlines in the newspapers. But first of all, I thought I would try and get an idea as to how good you guys are at understanding risk. Do you know what risky activities are? So I'm going to do a bit of a survey. I can't help it. I'm a statistician. We can't help but do surveys. So I've got some different risk scenarios. And I want you to tell me which is the most risky. So the first one is I'm going to ask you which is the most dangerous animal? So which animal causes the most deaths? So I'll show you both options. And then I'm going to ask you to vote for me. So is it crocodiles or is it hippos? So if you think it's a crocodile, give me a cheer. [CHEERS] It's about half a dozen, dozen people think crocodile. If you think it's a hippo, give me a cheer. [LOUD CHEERS] OK, that's overwhelmingly in favour of the hippo. I can tell you it's not looking good because-- [LAUGHTER] --crocodiles, according to the World Health Organisation, cause 1,000 deaths a year compared with 500 from hippos. 
OK, so I got two more. Let's see if you can redeem yourself. Which is the most dangerous sport? So which causes the most accidents? Is it baseball or is it cheerleading? So give me a cheer if you think it's baseball. [CHEER] Give me a cheer if you think it's cheerleading. [CHEER] Now, it's a little bit more 50/50, probably slightly in favour of the cheerleading. So you have slightly redeemed yourself. Cheerleading does cause more accidents than baseball. OK, last one, which is the most dangerous mode of transport? Is it riding a bike or driving a car? So give me-- give me a cheer if it's riding a bike. [CHEER] Give me a cheer for driving a car. [CHEER] That's pretty 50/50 actually. OK, let me do that one more time. Riding a bike? [CHEER] Driving a car? [CHEER] I'd say that's about 50/50. I can tell you that actually riding a bike is more dangerous than driving a car, 550 deaths per billion hours versus 130 deaths. This is a really interesting one though, because it raises all sorts of questions as to how do we even measure risk? I've chosen to measure this based on the amount of time a person would spend doing each activity. Some people may choose to think about the number of accidents per mile travelled. But cycling still comes out as more dangerous. Some people might think to do this by the number of journeys that you take. If we think about other modes of transport, such as flying in airplanes, the risk isn't constant the whole time that you're up in an airplane. It's more risky as you take off and as you land. So even just how do we measure risk is a really interesting question in itself. So now we've established that you're OK at risk, I think that's a fair assessment. As I said, I want to talk to you about risks that you see every day and give you a toolbox as to everything you should be asking. And I want to start out by talking about the humble bacon sandwich. [LAUGHTER] Now, according to the headlines bacon is one of the worst things you can be eating.
It causes all sorts of different types of cancer. This headline here says, "Daily fry-up boosts cancer risk by 20%." So if you eat bacon on a daily basis, you increase your risk of pancreatic cancer by 20%. And that's a shocking statistic, that actually caused bacon sales to plummet. Is it really though something that we need to be worried about? Crucially, when we see headlines like this, they're actually giving us what we call relative risks. They're only telling us what the risk is in one group relative to another. So I know that if I eat bacon, I've got a 20% increased risk compared to those who don't eat bacon. But I don't know anything about what the risk actually is. And that's where absolute risks come in. So absolute risks depend on the numbers that are associated with each of these things. So how do I take a relative risk and turn it into an absolute risk? First of all, I need to know what my chances are of getting pancreatic cancer. And according to Cancer Research UK, we have a 1 in 80 lifetime risk of pancreatic cancer. So what does that mean? That means if we were to take 400 individuals who didn't eat bacon, we would expect five of them to get pancreatic cancer anyway. My clicker has decided to stop working. So if we then look back at our headline, our headline says that our daily fry-up boosts our cancer risk by 20%. 20% is a fifth. And what's a fifth of 5? It's just one, meaning that our risk goes from five in every 400 individuals to six in every 400 individuals. It's only an extra one person in every 400. So whilst that 20% increase sounded really scary, a headline that said it increases your risk by an extra one person in every 400 wouldn't sound anywhere near as scary, or like anything you needed to be worried about. There was also a headline that said that bacon, ham, and sausages were now as big a cancer threat as smoking, the World Health Organisation were to warn.
Now, the reason for this is that the World Health Organisation produces these lists of known risk factors for different types of cancer. And smoking was already on there as a known risk factor for lung cancer. And they were saying that because this processed meat was now being added to that list for the first time, it meant that they were as risky as each other. Now, these lists are based on something called statistical significance. And statistical significance just tells us whether or not something definitely does or definitely doesn't cause cancer. It doesn't quantify that risk in any way. So how do the risks for smoking and lung cancer compare to those risks that we've just seen for bacon and pancreatic cancer? So if we take our 400 individuals again, if you got 400 people who don't smoke, you would expect four of them to get lung cancer anyway. If you smoke 25 or more cigarettes every single day, that goes up 24 times, to 96 in every 400 individuals. So that's an extra 92 in every 400 compared to that extra one in every 400 for the bacon and pancreatic cancer. So, yes, they may both be statistically significant in causing cancer. But to say that they now were as big a cancer threat as each other, they were as risky as each other, is absolutely ludicrous because we can see that there is a huge difference in the risks. What we also need to think about when we see these headlines is that it compared those who eat bacon every single day with those who never eat it. And the risk was only increased by an extra one person in every 400. If you only eat bacon, say, once a week on a Saturday morning as a treat to yourself, it's going to have an even smaller effect. And so it's going to be absolutely tiny, this risk, Plus, what you also need to think about is if you're eating bacon for breakfast every day, you're not eating fruit for breakfast every day. You may be more likely to have an unhealthy lifestyle in general. 
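The relative-to-absolute conversion used for bacon and for smoking can be written down in a few lines. This is only a sketch of that arithmetic: the function name `cases_per_group` and the choice to use exact fractions are assumptions made for illustration, while the numbers (1 in 80, a 20% increase, 4 in 400, 24 times) are the ones quoted in the talk.

```python
from fractions import Fraction


def cases_per_group(baseline_risk, relative_risk, group_size=400):
    """Return expected cases (without exposure, with exposure) in a group."""
    unexposed = baseline_risk * group_size
    exposed = unexposed * relative_risk
    return unexposed, exposed


# Bacon and pancreatic cancer: 1 in 80 lifetime risk, increased by 20%.
bacon = cases_per_group(Fraction(1, 80), Fraction(120, 100))

# Smoking 25+ cigarettes a day and lung cancer: 4 in 400, multiplied by 24.
smoking = cases_per_group(Fraction(4, 400), 24)

for label, (before, after) in [("bacon", bacon), ("smoking", smoking)]:
    print(f"{label}: {before} -> {after} per 400 (extra {after - before})")
```

Both headlines describe a relative risk, but only the absolute figures show the gap: one extra case per 400 for bacon against 92 extra per 400 for heavy smoking.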
And how do we know that it's the bacon that's actually causing this increased risk of cancer, and it isn't another one of these unhealthy lifestyle factors instead? And what we say in statistics is that correlation doesn't always mean causation. Here are some of my favourite headlines for demonstrating correlation versus causation. So, yeah, fizzy drinks make teenagers violent. Fizzy drink a day make teenagers behave aggressively. Children drinking fizzy drinks are regularly more likely to carry a gun. [LAUGHTER] Now, it could be that drinking fizzy drinks makes teenagers violent. Or it could be that there's some other social demographic factor that means a teenager is more likely to have fizzy drinks in their diet. And they're also more likely to be violent. Or it could be that being violent is thirsty work. And at the end of it, you want a fizzy drink. [LAUGHTER] We don't know which way around that relationship goes. One of the best ways to think about correlation versus causation is if you think about ice cream sales. As ice cream sales go up, so do the number of drownings. [LAUGHTER] So does that mean that ice cream causes drownings? Both of these things are affected by something else, hot weather. As the temperatures increase, we eat more ice cream, as temperatures increase, we go into the sea more often, meaning that there are naturally just more drownings. Once we actually take that into account, that direct relationship between ice creams and drownings disappears. We call this a confounding factor. And once we account for the confounding factor in our analysis, then that direct relationship between these two things disappears. There's a really nice website that I like to go on, that allows you to correlate all these weird and wonderful things with each other. So I got on there. And I've picked out some of my favourites. If you google spurious correlations, it's the first one that comes up. You can have a lot of fun with it. 
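The ice cream and drownings story can be imitated with made-up data. In this sketch everything is invented for illustration (the temperature range, the slopes, the noise levels, the seed, and the helper names `pearson` and `residuals`); none of it comes from real sales or drowning figures. Temperature drives both series, and neither series affects the other.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def residuals(ys, xs):
    """What is left of ys after removing a least-squares line in xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [y - (my + slope * (x - mx)) for x, y in zip(xs, ys)]


rng = random.Random(1)  # fixed seed so the run is repeatable
temperature = [rng.uniform(0, 30) for _ in range(2000)]
# Hypothetical model: hot weather pushes both quantities up independently.
ice_cream = [2.0 * t + rng.gauss(0, 5) for t in temperature]
drownings = [0.5 * t + rng.gauss(0, 3) for t in temperature]

raw = pearson(ice_cream, drownings)
adjusted = pearson(residuals(ice_cream, temperature),
                   residuals(drownings, temperature))
print(f"raw correlation: {raw:.2f}, after accounting for temperature: {adjusted:.2f}")
```

The raw correlation comes out strongly positive, while the correlation between the two sets of residuals sits close to zero, which is the "relationship disappears once we account for the confounder" effect in miniature.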
I spent too much time trying to find interesting correlations for my presentations. But there is a 98% correlation between the amount of money spent on admission to spectator sports and the number of people who died by falling down the stairs. So does this mean as we're all spending money to go to these big arenas, we're all falling down the stairs at the same time? I don't know. There is a 94% correlation between maths doctorates awarded and the amount of money spent on pets. Now, I'm a dog lover. Am I a dog lover because I'm a mathematician? I don't know. My absolute favourite though is that there is a 95% correlation between the per capita consumption of cheese and the number of people who died by becoming tangled in their bed sheets. So does this mean that we shouldn't eat cheese before we go to bed because we might die? These are two things that are obviously just-- they just happen to be correlated with each other. And it doesn't mean that one is causing the other. Now, you might be saying to yourself, this is all well and good. This is all very funny. I know that cheese doesn't cause death by bedsheet. When do I actually really need to think about this in real life? I was asked to comment on a story that was run by the BBC-- and it was January 2017, so just over two years ago now-- that said living near a busy road increased your risk of dementia. It apparently increased your risk of dementia by 7%. And the BBC got in touch and wanted me to comment on it. And they wanted me to talk about this relative versus absolute risk. So I went and I had a look at the paper-- it was published in The Lancet-- just to try and get an idea as to what the absolute numbers might have been. And whilst I was looking at this study, I realised that they hadn't controlled for family history in their analysis. And I argued that we know there's a huge family history element to dementia. But you could also argue that there's a family history element as to where you might live.
If you grew up in the middle of the countryside, you might be more likely to continue living in the countryside as an adult. If you grew up in a big city, you might be more likely to live in a big city as an adult. And so you've got this big family history element to dementia and family history element as to where you might live. And I said the fact that that wasn't accounted for in the analysis was a major let down in the study. Also while I was looking at this, I looked at the supplementary material. And it looked at all of these other things that might be associated with dementia. So this top row here are all of these different factors that they thought might be associated with dementia. So, for example, where you can see smoking, that 1.3 means that smoking increases your risk of dementia by 30%. Obesity, so obese versus a normal weight, increases your risk of dementia by 64%. And yet the newspapers had chosen to really focus on this living near a busy road increasing your risk of dementia by 7%. And I said, you know, before you up sticks and move to the countryside, there are lots of other things that you could do that would have a bigger effect on your risk of dementia, quitting smoking, losing weight, and higher education versus lower education. But the newspapers just chose not to report any of this. And so what I would always say to you when you see these headlines, have a little think to yourself, what aren't they telling us? What else could be going on? So I'm asked quite a lot actually to comment on stories that appear in the press. And this was another one that I was asked to comment on at the beginning of last year, that said that 2017 was the safest year for air travel as fatalities fall. So in 2017, there were no deaths anywhere in the world caused by passenger jet crashes. And early into 2018, there then was a passenger jet crash.
And there was all this sort of investigation as to has everything gone wrong? Was 2017 the safest year we're ever going to have? Do we now have to start investigating to try and figure out what's happened? And so I was asked to comment on this story. And I want to do a little demonstration with you. So there are dice in these two rows. Some of you will have been given dice or had dice underneath your seat as you sat down. Could you just wave your dice subtly, please? So we're going to do a little demonstration with these dice. And we're going to do a little demonstration that thinks about these things, speed cameras. So everybody's favourites, I know. So when speed cameras first came in, the government needed to give some serious thought as to where they might put them. We couldn't put speed cameras absolutely everywhere. Was there some sort of sensible strategy that we could adopt, to decide where we might put those speed cameras? And we're going to recreate that exercise now. And what the government decided to do was to try and identify all the accident hotspots. And they said that those were obviously the most dangerous places. And those were the ones that were in the highest need of getting these speed cameras. And we're going to recreate that now. So all of you who have got the dice, in a second I'm going to ask you to roll them. I'm very aware of the fact that there's not that much room. So can I just suggest you give them a good shake in your hand and just drop them on the floor. But, yes, so I'm going to ask you to do that. I want you to do it twice. And count up the score that you get. And then we're going to decide where we're going to put our speed cameras. So if you could all roll them for me twice, that would be marvellous. There should be quite a few more on that row actually. There should be more than one. Oh, you've handed them back. OK, that's fine. OK, right, so did anybody get a 12? Anybody get an 11? Oh, I heard a twitch then. Anybody get a 10?
I only got 1. OK, right, what we do is we're going to redo this. I usually have more dice. I gave this talk last week to a group of teenagers. And they stole my dice because they all like to take souvenirs. So I'm now doing a probability problem with fewer dice than I normally would have. So bear with me. And we'll just repeat that again. So if you could all just give them a really good roll and repeat that. Do it twice for me. And we'll see what we get. This is what happens when you present to teenagers. They like souvenirs. OK, right, how about this time. Did anybody get any 12s? Any 11s? We've got two, OK, right, brilliant. 3? 3? Sorry, it's the lights. OK, so I'm going to give you a speed camera. So we got a speed camera there. You happen to be all spread out as much as you possibly could be. Oh, thank you very much. And a speed camera over here. Can I just ask you to pass that back behind? That'll be brilliant. Thank you. So now we've got our speed cameras in place. We now need to see if they've worked. So what I want you to do is to all repeat the same thing again. Last time, I promise. If you could all just roll your dice for me again, twice. OK, where we have speed cameras, what did you get this time? 6. 6. 8. 8. 7. 7. So we can see where we've got our speed cameras we've seen a reduction in the number of accidents-- [LAUGHTER] --meaning our speed cameras have worked, right? Maybe not. OK, so this is a really nice demonstration of what we call regression to the mean. So what do I mean by regression to the mean? Regression to the mean is we all behave according to some sort of average. But we don't get exactly the same value every single time. We have these random fluctuations around it. I simulated here a number of accidents, where I kept the average a constant 10 all of the time. But you see that I don't always get 10 accidents every month. Sometimes I see higher than that. Sometimes I see lower than that.
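The dice demonstration can be replayed in code with far more dice than fit in a lecture theatre. This is a rough sketch with assumed details: 10,000 imaginary sites, a fixed seed, and a "hotspot" defined as a first score of 11 or 12. The "speed camera" does nothing at all here; the second pair of rolls is completely independent of the first.

```python
import random


def two_rolls(rng):
    """Score from rolling one die twice and adding the results."""
    return rng.randint(1, 6) + rng.randint(1, 6)


rng = random.Random(2)  # fixed seed so the run is repeatable
sites = 10_000          # assumed number of sites, chosen for illustration

first = [two_rolls(rng) for _ in range(sites)]

# "Install cameras" wherever the first score was unusually high.
hotspots = [i for i, score in enumerate(first) if score >= 11]

# Roll again at the hotspots only; the camera has no effect on the dice.
second = {i: two_rolls(rng) for i in hotspots}

before = sum(first[i] for i in hotspots) / len(hotspots)
after = sum(second[i] for i in hotspots) / len(hotspots)
print(f"{len(hotspots)} hotspots: average {before:.2f} before, {after:.2f} after")
```

The hotspots average above 11 the first time because that is how they were selected, and fall back to roughly 7, the long-run average of two dice, the second time, even though nothing about the dice changed.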
And this is what you would expect to see just by chance. And crucially, it was one of these random highs where I actually chose to put in my intervention. You didn't have a higher chance of rolling a high number that then changed when I gave you a yellow hat. I chose to put the interventions in those places that were randomly high the first time around. And regression to the mean tells me that I would have expected them to be lower the next time around just by chance. It's regressing to the mean. So regression to the mean also tells you that if I've got something lower the first time around, I'd expect it to be higher the next time around. And this is exactly what happened when speed cameras came in. The government put them in those places with the highest number of accidents. They saw there was a reduction. And then they said that the speed cameras must have been working. And it took a group of statisticians to come in and say, actually, you need to be looking at this over a long period of time to be able to say whether or not the average is changing through time, whether or not the average number of accidents is actually decreasing. So we see regression to the mean all of the time in sports. If you think about your favourite sports teams, they'll go on random winning streaks. And they'll go on random losing streaks. When they go on losing streaks, they sometimes sack their manager. And a new manager comes in. And they say, oh, look, they're winning again, must be the manager. A lot of the time, it can be explained by regression to the mean. That losing streak is just a random low. And when they bring the new manager in, they just go back to their average performance level. And some research has shown that actually those teams that stick with their managers see the bounce back in form much quicker than those that actually bring in new managers. There's something called the Sports Illustrated curse, that says when you appear on the cover of Sports Illustrated, it's a curse.
You then go on to perform really badly. But it can be explained by regression to the mean. If you think about what does it take to appear on the cover of Sports Illustrated, you have to be at the very top of your game, which is going to be a combination of your natural ability. But you're probably also going to be riding one of these random highs as well. And this curse isn't necessarily a curse. It's just you then-- that random high coming to an end. And you're going back to your average ability. So I argued that when we looked at this story here, all of this could be explained by regression to the mean. We would expect the number of air crashes and fatalities to remain low. But we are going to see these fluctuations around it. Some years, we're just naturally going to see slightly more. And some years we're just going to see slightly less. And the fact that there had been none in 2017, and then one in 2018, didn't necessarily mean that everything had gone wrong. And we all of a sudden needed to be having these big investigations. There were also stories about this time last year looking at the London murder rate now beating New York, as stabbings surged. And there was a question as to whether or not London was now a more dangerous city than New York. And BBC's Reality Check actually looked into this. So this is a really good resource, the BBC's Reality Check. So they get statisticians to look at these sorts of claims. And the claim was that London had overtaken New York for murders. And it was now more dangerous. And they found that a selective use of statistics over a short period of time appeared to bear it out. But the reality was that New York still appeared to be more violent than London. If you looked at it over a longer period of time, then New York did appear to be still more dangerous than London. So while we're on the topic of airplanes, I'm a vice president of the Royal Statistical Society.
And as part of the work that I do with the RSS, I have a bit of a hobby. And I like to give Ryanair a headache. [LAUGHTER] So it first started out when I was approached by BBC's Watchdog. So Ryanair had changed their seating allocation algorithm. It used to be that if you'd booked as a group, when you checked in you would all get to sit together. And then they changed it and they said if you didn't book seats together and pay for them, you would be randomly scattered throughout the plane. And loads of people started complaining to Watchdog, saying that they thought there were too many middle seats being given out. So the window seat is quite desirable because you get the nice view. The aisle seat, you get a little bit of extra legroom. The middle seat is seen as the least desirable seat. But everyone seemed to be getting them. And they thought there might be something going on there. So they decided to send four of their researchers on four flights. And on every single one of the flights, they were all allocated middle seats. And they got in touch with me and said, hey, what's the chances of that happening, if the seating allocation is truly random? So I did some stats for them. And then I went on TV and I told them what I found. So it wasn't actually a very complicated calculation that we did. They sent me the information available at the time of check-in for each of the four flights. So this is an example of one of them. So when they checked into their flights there were 23 window seats, 15 middle seats, and 27 aisle seats available, so a total of 65 seats. And using this, I can then work out the probability that they're all given middle seats. So the probability that the first person gets a middle seat is 15 over 65 because there's 15 middle seats and 65 seats in total. The probability then that the next person is given a middle seat is 14 over 64 because there's now 14 middle seats available from 64 seats in total. And I carry on.
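The step-by-step calculation for this one flight can be sketched as a short loop, taking the 15 middle seats and 65 seats in total from the fractions quoted. The function name `all_middle` and the use of exact fractions are choices made here for illustration, and the sketch assumes, as the calculation in the talk does, that each remaining seat is equally likely to be handed out.

```python
from fractions import Fraction


def all_middle(middle, total, people):
    """Probability that `people` passengers in a row all get middle seats
    when seats are handed out uniformly at random without replacement."""
    prob = Fraction(1)
    for k in range(people):
        # One fewer middle seat and one fewer seat overall each time.
        prob *= Fraction(middle - k, total - k)
    return prob


p = all_middle(middle=15, total=65, people=4)
print(f"{float(p):.4%} (about 1 in {round(1 / p)})")
```

For the single flight this gives a probability of roughly 0.2%. The same function could be applied to the availability on each of the other flights and the results multiplied together, although those seat counts are not given here.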
So the probability of the third person is 13 over 63 and the fourth person is 12 over 62. And if I multiply all of these together, that gives me the probability of all four being middle seats. And it's about 0.2%. Which is one in 500, which isn't actually that small a probability if you think about how many flights Ryanair have every day. One in 500 of this, it's not too surprising. But this is just one flight. As I said, they did it on another three flights. And they all got middle seats on those three as well. So I did the same calculations for the other three flights. And then I combined it altogether. And I found out the probability of all four researchers get middle seats on all four flights was around 1 in 540 million. So you were more than 10 times more likely to win the national lottery than you were for this scenario to happen. But, you know, tiny probabilities don't necessarily mean rare events. So I went and had a look at Ryanair's facts and figures. And they say that they only carry 130 million annual customers. So I was pretty convinced that not only was this a small probability, it was a rare event. And I was suspect as to whether or not there was something going on with their algorithm. Now, they said, you know, we've got our stats. You've got yours. My stats are right, thank you very much. But anyway-- And it all kind of-- it kind of died a little bit of a death. We got some media attention. It was in the newspapers, but then not really very much happened. Until a couple of months later, when 12 women all went on holiday together. And they all got middle seats. And they called the Telegraph. And the Telegraph then called me and said, hey, we heard you did some work on this. What are the chances? So I went through everything that I'd done on the Watchdog story. And they got in touch with Ryanair. And at that point, Ryanair admitted that they'd been lying. They admitted that they actually kept window and aisle seats. 
They held them back when randomly allocating the seats because those were the ones that people were most likely to pay for. So this random allocation wasn't a random allocation throughout the whole plane. It was a random row within middle seats that you were actually getting. And so, yeah, I was really happy that I managed to get Ryanair to admit to their customers that they'd lied. And I also managed to upset them in the process because they didn't think that the negative media attention, including the BBC investigation, was warranted. So let's all feel sorry for Ryanair. However, they are the gift that keeps on giving. So last April, they released the results of a customer satisfaction survey. They said that 92% of their customers were satisfied with their flight experience. I thought, really? I'd been on a Ryanair flight, 92%? So I decided to take a look at the survey. Now, bear in mind this was an opt-in survey. So my argument was you're only going to opt into a survey if you're really satisfied with your experience and you want them to know about it or you're dissatisfied with your experience and you want them to know about it. This was a survey that they asked people to fill in. So this was the 92% here. But if we look at the options that people got when filling out this survey, they went from excellent to OK. So if you were dissatisfied with your Ryanair experience, there was no way of expressing that dissatisfaction at all. And I argued that you just then wouldn't carry out-- you just wouldn't fill out the survey. You'd just exit, switch it off, and you'd disappear. So basically what they were asking was a group of satisfied Ryanair customers just how satisfied they were with their Ryanair experience. And then we were really surprised that once you combined three of the columns, you got a high percentage. So I went into the Times and I said as much.
And they had a really grown up response, where they said 95% of Ryanair customers haven't heard of the Royal Statistical Society. [LAUGHTER] 97% don't care what they say. And 100% said it sounds like their people need to book a low-fare Ryanair holiday. I mean, the stats in that are wrong because if 100% say we need to book a low-fare holiday then 100% of them have heard of us. So the stats are wrong to start off with. But one of the members of the Royal Statistical Society noted that there were 130 million annual Ryanair customers. And if 5% of them had heard of the Royal Statistical Society, that meant that it was 6 and 1/2 million Ryanair customers who had heard of the Royal Statistical Society. And to be honest, we'd probably take that. But there we go. So, yeah, I like to-- I like to give Ryanair a headache as a hobby. It's quite fun. Interestingly, though, my boyfriend is currently at the end of his training to be a pilot. And Ryanair is one of the big options that he might want to work for. So that's irony right there. So as I said, I'm a member of the Royal Statistical Society. And one of the big projects that we've got for this coming year is actually trying to improve data ethics in advertising. So why is this such an issue? I'm going to play you a little advert. And we're going to talk about adverts in a little bit more detail. [VIDEO PLAYBACK] [MUSIC PLAYING] - Pearl Drops cleans. Pearl Drops whitens. Pearl Drops protects. Pearl Drops shines. Pearl Drops 4D Whitening System, not only whitens, but cleans, shines, and protects, too. Ultimate whitening, up to four shades whiter in just three weeks. Pearl Drops Tooth Polish, go beyond whitening. [END PLAYBACK] So we're used to seeing these all the time. And we're used to seeing these things all of the time, at the bottom of them. So there's some survey that's been done. And so many people agree. Now, there's lots of things wrong with this. First of all, agree with what exactly?
I mean, there are a lot of claims in the advert. It cleans, it whitens, it brightens. Which of these exactly are they agreeing with? But a lot of time when people hear I'm a statistician, it's like, oh, adverts, can't trust anything, can you? You can't trust any of the stats in there. They use such small sample sizes that none of the results are reliable. And I want to talk about this in a little bit more detail. So when you see this 52% of 52 people agreed, what should you be thinking about when you see this? So I want to talk a little bit about uncertainty. What do I mean by uncertainty? So if I was to take 10 people and line them up here and ask them to flip a coin 10 times, I know that there is a 50/50 chance of getting a head or a tail. But if they all flipped it 10 times, they wouldn't all get five heads and five tails. Some people might get six heads. Some people four heads, some people might get 10 heads. That's what I mean by uncertainty. In statistics, we talk about the difference between probability theory and statistical inference. So in probability theory, we know the underlying probability. And yet we see noisy data when we do experiments. So that coin example, I know the underlying probability is 50/50. But I see noisy data when I do different experiments with it. A lot of the time in statistics what we're actually trying to do is to go the other way. And we're trying to use samples of data that we know are noisy and subject to uncertainty and use that to tell me something about what the underlying probability is. So there I had 52% of 52 people agreed. If I'd taken a different 52 people, I wouldn't have seen exactly 52% agree. And I'd have seen a slightly different number. And a different 52 people would have given me a slightly different number again. 
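The coin example can be made concrete with the binomial formula. This is a small sketch of what probability theory says about ten flips of a fair coin; the function name `prob_heads` is just a label chosen here.

```python
from math import comb


def prob_heads(k, n=10, p=0.5):
    """Probability of exactly k heads in n flips with head probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


for k in (4, 5, 6, 10):
    print(f"{k} heads out of 10: {prob_heads(k):.4f}")
```

Even with a known 50/50 coin, exactly five heads turns up in only about a quarter of ten-flip runs, so most people in the line-up would see something other than five heads and five tails. That spread is the noise that statistical inference has to work backwards through.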
And ultimately, what I'm trying to do in statistics is to take that sample and that piece of data that I know is noisy and subject to uncertainty, and use that to tell me something about what the underlying probability is on a population level. So what I'm trying to do is I'm really trying to create a hypothesis test to see whether or not that percentage that I'm seeing is statistically significant? So what do I mean in hypothesis testing, how do I carry that out? What I do is I formulate what we call a null hypothesis. And a null hypothesis would be that the observations are a result of pure chance. So my underlying probability of people agreeing is actually just 50/50. It's all just down to chance. And what I then say is let's assume that that's true. Let's assume a null hypothesis is true. Let's assume that the data I'm seeing is just random. And it's just by chance. What then is the probability of me seeing the data that I've seen or seeing something at least as extreme as what I've seen? So let's break that down. I understand that's quite a lot to get your head around. So let's break that down for this particular example. So my null hypothesis in this example would actually be 50%. I'm assuming that these people have got a survey that say, you agree that this toothpaste whiten your teeth, yes or no? What pure random would be if they just randomly ticked yes or no. And so across my sample, I would expect it to be about half yeses and half nos. That would be what would be my pure random just by chance. So 50% would be what would correspond to my pure chance. So then if I had, going in the direction of agreeing or going in the direction of disagreeing, that's actually giving me information. That's telling me some of the people have got an opinion as to whether or not they disagree or agree with that statement. And I've got my 52% here. And I know that that is subject to uncertainty. 
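The question "what is the probability of seeing something at least as extreme, if it really is 50/50?" can be answered directly for this survey. The sketch below makes two assumptions that are not stated in the talk: that 52% of 52 people means 27 people said yes, and that an exact two-sided binomial test is an acceptable way to carry out the test. The function name `two_sided_p_value` is hypothetical.

```python
from math import comb


def two_sided_p_value(successes, n):
    """Exact two-sided binomial test against a 50/50 null hypothesis.

    Adds up the probability of every outcome at least as far from n/2
    as the observed count, in either direction.
    """
    distance = abs(successes - n / 2)
    extreme = sum(comb(n, k) for k in range(n + 1)
                  if abs(k - n / 2) >= distance)
    return extreme / 2**n


p_value = two_sided_p_value(27, 52)  # 27 of 52 is an assumed reading of "52%"
print(f"p-value: {p_value:.2f}")
```

A result this close to 26 out of 52 is exactly the kind of thing pure chance produces most of the time, so the p-value is large and there is no evidence against the null hypothesis.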
So as I said, I know that if I took a different 52 people, I'd get something slightly different from this. So what I can do is I can put a confidence interval on this. And this confidence interval is related to the sample size. And it tells me, OK, 52%, that's the best estimate as to what the true underlying probability might be. But what could it be? What values could it possibly take? And a confidence interval then gives me a range of values that might be plausible. And as I increase my sample size, I actually decrease the amount of uncertainty. And I decrease-- I make my confidence interval smaller. And what we're looking for in hypothesis testing for a statistically significant result is we want that confidence interval to not cross the null hypothesis. So my null hypothesis here was 50%. I want my confidence interval to not cross that 50%. If it doesn't cross that 50%, I say it's a statistically significant result. If it does cross that, then I say I haven't got enough evidence to say that it's anything other than pure chance. It could just be pure chance. So that confidence interval really matters when I've got something close to the null hypothesis because that 52% is really close to that 50%. I'm going to need quite a big sample size to be able to make that confidence interval small enough so that it doesn't cross that 50%. If, on the other hand though, I were to get a result that was a lot further away from that 50%, I wouldn't necessarily need to have as big a sample size because it's not as close to that null hypothesis of 50%. It doesn't matter if the confidence interval is wider. So yes, when you see these surveys that are being done on small samples, it's not always a problem. It depends on how big your result actually is. It's not just the sample size in itself, but it's also what we call, you know, the effect size as well. So just back to our example, we have 52% of 52 people agreed. A 95% confidence interval on this based on these 52 people is 38 to 66. 
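To make the link between sample size and interval width concrete, here is a minimal sketch using the simple normal-approximation (Wald) interval. The talk does not say which method produced its figures, so the choice of method, the name `wald_interval`, and the reading of 52% of 52 as 27 people are all assumptions.

```python
from math import sqrt


def wald_interval(successes, n, z=1.96):
    """Approximate 95% confidence interval for a proportion (Wald method)."""
    p_hat = successes / n
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    # Clamp to [0, 1] because a proportion cannot fall outside that range.
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


# Assumption: 27 of the 52 people agreed (about 52%).
low, high = wald_interval(27, 52)
crosses_null = low <= 0.5 <= high
print(f"{low:.1%} to {high:.1%}, crosses 50%: {crosses_null}")

# The same 52% from a sample a hundred times larger gives a narrower interval.
big_low, big_high = wald_interval(2704, 5200)
print(f"{big_low:.1%} to {big_high:.1%}, crosses 50%: {big_low <= 0.5 <= big_high}")
```

With 52 people the interval comes out close to the 38 to 66 quoted and straddles 50%. With the hypothetical 5,200 people the same 52% no longer crosses 50%, which is the "close to the null needs a big sample" point in code form.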
So 95% means if I was to repeat this a hundred times, I would expect 95% of them to be between 38 and 66. And it crosses that 50% mark. So here, this is no different from just pure randomness. This is no different from people just flipping a coin, saying yes or no, I agree with that statement. If I was to take another one that said 74% of 54 men agreed with some statement after 28 days, the confidence interval on this is 60 to 85. So it's a similar kind of sample size. We've got a similar-sized confidence interval. But because our treatment effect was 74% to start off with, and that's a lot further away from the 50, I have enough evidence here to say that there is a difference. And actually people do have a preference. And I'm just going to finish off now with a couple of final graphics, that just say-- because I don't think uncertainty necessarily has to be very difficult to communicate. I think-- [LAUGHTER] When we look at the weather and we look at when they tell you your probability of rain, I mean, these numbers are ridiculous. So what is it? I mean, at 3 o'clock we have a 10% chance of rain. That goes up to 13% at 4 o'clock and 16% at 5 o'clock. What am I supposed to do with this information? I don't know what the uncertainty is on that. And they're really precise point estimates. But it would be super-easy to communicate the uncertainty using some sort of graphic. Now, graphics have the ability to do great good. They also do have the ability to do great evil. And I just want to finish off with a couple of my favourite bad graphics because it is something you really need to watch out for when you're looking at stats in the media. So there's this one, which is one of my favourites. This is the presidential run. So a pie chart should sum up to 100%. [LAUGHTER] This doesn't. And they've obviously here asked would you back this person, yes or no? And then thought that a pie chart was the most appropriate way to communicate that information. 
This one, I've got no idea what they asked. Half of Americans have tried marijuana today? I'm not-- I don't know if I believe that. But if 43% of them have tried it in the last year, which includes today, how have 51% tried it? The numbers are all wrong. I can't figure out what's going on. They have, however, included uncertainty. We know it's plus or minus 4%. But I've got no-- I've got no idea. This one from the Office for National Statistics is a very sneaky one. And one that shows you always need to look at the scale of the plot because actually this is an increase in GDP. And it went from 0.6% to 0.7%, their predicted growth upgrade. And it's a 0.1% increase. And that looks a lot bigger on that plot. And if we look at the axis along the bottom-- and look at the scale of that. If you zoomed out on that plot and looked at the whole percentage line, it would be just a minuscule difference. So look at the scale. And my last one is my favourite, from Ben and Jerry's. I don't know what world we live in where 62 is smaller than 61. But I don't want to live in that world. Here is an example where Ben and Jerry's have got a story that they want to tell. And the numbers didn't quite agree with that story. So they decided to produce a graphic that told that story anyway. And hoped we wouldn't look at the numbers in enough detail. Look at the size of that 24% compared to that 21%. They've got a clear story that they wanted to tell. And if you were just flicking through a magazine, you might not necessarily look at the numbers in as much detail. So yes, so very, very naughty from Ben and Jerry's. So if you look at stats in the media, I would encourage you to think: relative versus absolute risks, correlation versus causation. Could this have happened just by chance? Is this regression to the mean? Yeah, eat bacon. But don't eat cheese before you go to bed. Thank you very much. [APPLAUSE]
Channel: The Royal Institution
Views: 44,672
Rating: 4.8481355 out of 5
Keywords: Ri, Royal Institution, stats, statistics, correlation, causation, risk, relative risk, absolute risk, data, fake news, bad data
Length: 42min 20sec (2540 seconds)
Published: Wed Aug 07 2019
Reddit Comments

Sounds like she's been sent out to get us to ignore risks, especially when avoiding that risk will cost a capitalist money. After all, what does throwing dice have to do with car accidents, with or without cameras? Nothing.

Or as XKCD says: Correlation doesn't imply causation, but it does suggestively wiggle its eyebrows and say "look over there".

👍︎︎ 1 👤︎︎ u/alllie 📅︎︎ Aug 09 2019 🗫︎ replies