[MUSIC PLAYING] They just weren't very good. Anyway, here we go. So "Living is a
Risky Business," I'm going to be talking
today about risk. We are bombarded
every single day with headlines full
of all of these things that we should and
shouldn't be doing in order to live a longer life. You know, dementia, sitting too long may increase a middle-aged person's risk. And sometimes these even
have numbers on them. So an everyday painkiller
doubles the risk of heart attacks and strokes. A child's risk of brain cancer
triples after just two CT scans. And we're supposed to use
these headlines to inform our day-to-day lives and to
help us make decisions as to how we should be living our lives. But how do we actually make
sense of these numbers? So throughout this
talk, I'm going to give you a little
bit of a tool box as to all of the things that
you should be thinking about, all of the questions that you
should be asking yourselves, when you see these sorts of
headlines in the newspapers. But first of all, I
thought I would try and get an idea as to how good you
guys are at understanding risk. Do you know what
risky activities are? So I'm going to do
a bit of a survey. I can't help it. I'm a statistician. We can't help but do surveys. So I've got some
different risk scenarios. And I want you to tell me
which is the most risky. So the first one is I'm
going to ask you which is the most dangerous animal? So which animal causes
the most deaths? So I'll show you both options. And then I'm going to
ask you to vote for me. So is it crocodiles
or is it hippos? So if you think it's a
crocodile, give me a cheer. [CHEERS] That's about half a dozen, a dozen people who think crocodile. If you think it's a
hippo, give me a cheer. [LOUD CHEERS] OK, that's overwhelmingly
in favour of the hippo. I can tell you it's not
looking good because-- [LAUGHTER] --crocodiles, according to
the World Health Organisation, cause 1,000 deaths a year
compared with 500 from hippos. OK, so I've got two more. Let's see if you
can redeem yourself. Which is the most
dangerous sport? So which causes
the most accidents? Is it baseball or
is it cheerleading? So give me a cheer if
you think it's baseball. [CHEER] Give me a cheer if you
think it's cheerleading. [CHEER] Now, it's a little
bit more 50/50, probably slightly in
favour of the cheerleader. So you have slightly
redeemed yourself. Cheerleading does cause more
accidents than baseball. OK, last one, which is the most
dangerous mode of transport? Is it riding a bike
or driving a car? So give me a cheer if it's riding a bike. [CHEER] Give me a cheer
for driving a car. [CHEER] That's pretty 50/50 actually. OK, let me do that
one more time. Riding a bike? [CHEER] Driving a car? [CHEER] I'd say that's about 50/50. I can tell you that
actually riding a bike is more dangerous than driving
a car, 550 deaths per billion hours versus 130 deaths per billion hours. This is a really
interesting one though, because it raises all sorts
of questions as to how do we even measure risk? I've chosen to measure this
based on the amount of time a person would spend
doing each activity. Some people may choose to think
about the number of accidents per miles travelled. But cycling still comes
out as more dangerous. Some people might think to do
this, the number of journeys that you take. If we think about other
modes of transport, such as flying in
airplanes, the risk isn't constant the whole time
that you're up in an airplane. It's more risky as you
take off and as you land. So even just how
do we measure risk is a really interesting
question in itself. So now we've established that you're OK at risk, I think that's a fair assessment. As I said, I want to
talk to you about risks that you see every day and give
you a toolbox as to everything you should be asking. And I want to start out by
talking about the humble bacon sandwich. [LAUGHTER] Now, according to
the headlines bacon is one of the worst
things you can be eating. It causes all sorts of
different types of cancer. This headline here says, "Daily fry-up boosts cancer risk by 20%." So if you eat bacon
on a daily basis, you increase your risk of
pancreatic cancer by 20%. And that's a shocking
statistic, that actually caused bacon sales to plummet. Is it really though
something that we need to be worried about? Crucially, when we see
headlines like this, they're actually giving us
what we call relative risks. They're only telling us what
the risk is in one group relative to another. So I know that if
I eat bacon, I've got a 20% increased
risk compared to those who don't eat bacon. But I don't know anything about
what the risk actually is. And that's where
absolute risks come in. So absolute risks
depend on the numbers that are associated with
each of these things. So how do I take a
relative risk and turn it into an absolute risk? First of all, I need to
know what my chances are of getting pancreatic cancer. And according to Cancer
Research UK, we have a 1 in 80 lifetime risk
of pancreatic cancer. So what does that mean? That means if we were to
take 400 individuals who didn't eat bacon, we
would expect five of them to get pancreatic cancer anyway. My clicker has decided to stop working. So if we then look
back at our headline, our headline says that our daily
fry-up boosts our cancer risk by 20%. 20% is a fifth. And what's a fifth of 5? It's just one, meaning
that our risk goes from five in every
400 individuals to six in every 400 individuals. It's only an extra one
person in every 400. So whilst that 20% increase sounded really scary, a headline that said it increases your risk by an extra one person in every 400 wouldn't sound anywhere near as scary, or anything you needed to be worried about.
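That arithmetic is easy to check for yourself. A minimal sketch in Python, using only the figures quoted in the talk (the 1-in-80 lifetime risk and the headline's 20% relative increase):

```python
# Turning a relative risk into an absolute risk,
# using the figures quoted in the talk.

baseline = 1 / 80          # lifetime risk of pancreatic cancer (Cancer Research UK)
relative_increase = 0.20   # the "20%" from the headline

group = 400                                           # 400 people who never eat bacon
cases_without = baseline * group                      # 400 / 80 = 5 cases expected anyway
cases_with = cases_without * (1 + relative_increase)  # 5 * 1.2 = 6 cases

print(f"Without daily bacon: {cases_without:.0f} in {group}")
print(f"With daily bacon:    {cases_with:.0f} in {group}")
print(f"Extra cases:         {cases_with - cases_without:.0f} in {group}")
```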
There was also a headline that said that bacon, ham, and sausages were now as big a cancer threat as smoking, the World Health Organisation warned. Now, the reason for this is that
the World Health Organisation produces these lists
of known risk factors for different types of cancer. And smoking was already
on there as a known risk factor for lung cancer. And they were saying that
because this processed meat was now being added to that
list for the first time, it meant that they were
as risky as each other. Now, these lists are
based on something called statistical significance. And statistical
significance just tells us whether or not
something definitely does or definitely
doesn't cause cancer. It doesn't quantify
that risk in any way. So how do the risks for
smoking and lung cancer compare to those
risks that we've just seen for bacon and
pancreatic cancer? So if we take our 400
individuals again, if you got 400 people
who don't smoke, you would expect four of them
to get lung cancer anyway. If you smoke 25 or more
cigarettes every single day, that goes up 24 times, to
96 in every 400 individuals. So that's an extra
92 in every 400 compared to that
extra one in every 400 for the bacon and
pancreatic cancer. So, yes, they may
both be statistically significant in causing cancer. But to say that they were now as big a cancer threat as each other, that they were as risky as each other, is absolutely
ludicrous because we can see that there is a huge
difference in the risks. What we also need to think about
when we see these headlines is that it compared those who
eat bacon every single day with those who never eat it. And the risk was only
increased by an extra one person in every 400. If you only eat bacon,
say, once a week on a Saturday morning
as a treat to yourself, it's going to have an
even smaller effect. And so it's going to be
absolutely tiny, this risk, Plus, what you
also need to think about is if you're eating
bacon for breakfast every day, you're not eating fruit
for breakfast every day. You may be more likely to
have an unhealthy lifestyle in general. And how do we know that it's
the bacon that's actually causing this increased
risk of cancer, and it isn't another one of
these unhealthy lifestyle factors instead? And what we say in statistics is
that correlation doesn't always mean causation. Here are some of my favourite
headlines for demonstrating correlation versus causation. So, yeah, fizzy drinks make teenagers violent. A fizzy drink a day makes teenagers behave aggressively. Children drinking fizzy drinks regularly are more likely to carry a gun. [LAUGHTER] Now, it could be
that drinking fizzy drinks makes teenagers violent. Or it could be that there's some
other social demographic factor that means a teenager
is more likely to have fizzy drinks in their diet. And they're also more
likely to be violent. Or it could be that being
violent is thirsty work. And at the end of it,
you want a fizzy drink. [LAUGHTER] We don't know which way
around that relationship goes. One of the best ways to
think about correlation versus causation is if you
think about ice cream sales. As ice cream sales go up, so
do the number of drownings. [LAUGHTER] So does that mean that ice
cream causes drownings? Both of these
things are affected by something else: hot weather. As the temperatures increase, we eat more ice cream; and as temperatures increase, we go into the sea more often, meaning that there are
naturally just more drownings. Once we actually take
that into account, that direct relationship
between ice creams and drownings disappears. We call this a
confounding factor. And once we account for
the confounding factor in our analysis, then
that direct relationship between these two
things disappears. There's a really nice
There's a really nice website that I like to go on, that allows you to correlate
all these weird and wonderful things with each other. So I got on there. And I've picked out
some of my favourites. If you google
spurious correlations, it's the first
one that comes up. You can have a lot
of fun with it. I spent too much time trying to
find interesting correlations for my presentations. But there is a 98%
correlation between the amount of money spent on
admission to spectator sports and the number
of people who died by falling down the stairs. So does this mean as
we're all spending money to go to these big arenas,
we're all falling down the stairs at the same time? I don't know. There is a 94% correlation
between maths doctorates awarded and the amount
of money spent on pets. Now, I'm a dog lover. Am I a dog lover because
I'm a mathematician? I don't know. My absolute favourite though is
that there is a 95% correlation between the per capita
consumption of cheese and the number of people
who died by becoming tangled in their bed sheets. So does this mean that we
shouldn't eat cheese before we go to bed because we might die? These are two things
that are obviously just-- they just happen to be
correlated with each other. And it doesn't mean that
one is causing the other. Now, you might be
saying to yourself, this is all well and good. This is all very funny. I know that cheese doesn't
cause death by bedsheet. When do I actually
really need to think about this in real life? I was asked to comment on a
story that was run by the BBC-- and it was January 2017, so
just over two years ago now-- that said living near a busy road increased your risk of dementia. It apparently increased
your risk of dementia by 7%. And the BBC got in touch and
wanted me to comment on it. And they wanted me to
talk about this relative versus absolute risk. So I went and I had
to look at the paper-- it was published in The Lancet-- just to try and get an idea as
to what the absolute numbers might have been. And whilst I was
looking at this study, I realised that they hadn't
controlled for family history in their analysis. And I argued that we know
there's a huge family history element to dementia. But you could also
argue that there's a family history element
as to where you might live. If you grew up in the
middle of the countryside, you might be more
likely to continue living in the
countryside as an adult. If you grew up in
a big city, you might be more likely to live
in a big city as an adult. And so you've got this
big family history element to dementia and
family history element as to where you might live. And I said the fact
that that wasn't accounted for in the
analysis was a major let down in the study. Also while I was
looking at this, I looked at the
supplementary material. And it listed all of these other things that they had looked at, that might be associated with dementia. So this top row here are all
of these different factors that they thought might be
associated with dementia. So, for example, you can see that for smoking, that 1.3 means that smoking increases your risk of dementia by 30%. Obesity, so obese
versus a normal weight, increases your risk
of dementia by 64%. And yet the
newspapers had chosen to really focus on this living
near a busy road increasing your risk of dementia by 7%. And I said, you
know, before you up sticks and
move to the countryside, there are lots of
other things that you could do that
would have a bigger effect on your risk of
dementia, quitting smoking, losing weight, and higher
education versus lower education. But the newspapers just chose
not to report any of this. And so what I would
always say to you when you see these headlines, have
a little think to yourself, what aren't they telling us? What else could be going on? So I'm asked quite
a lot actually to comment on stories
that appear in the press. And this was another
one that I was asked to comment on at the
beginning of last year, that said that 2017 was the
safest year for air travel as fatalities fall. So in 2017, there were no
deaths anywhere in the world caused by passenger jet crashes. And early into 2018, there
then was a passenger jet crash. And there was all this
sort of investigation as to has everything gone wrong? Was 2017 the safest year
we're ever going to have? Do we now have to start
investigating to try and figure out what's happened? And so I was asked to
comment on this story. And I want to do a little
demonstration with you. So there are dice
in these two rows. Some of you will
have been given dice or had dice underneath
your seat as you sat down. Could you just wave your
dice subtly, please? So we're going to do a little
demonstration with these dice. And we're going to do a
little demonstration that thinks about these
things, speed cameras. So everybody's
favourites, I know. So when speed cameras
first came in, the government needed to
give some serious thought as to where they might put them. We couldn't put speed cameras
absolutely everywhere. Was there some sort
of sensible strategy that we could adopt, to decide
where we might put those speed cameras? And we're going to
recreate that exercise now. And what the government
decided to do was to try and identify
all the accident hotspots. And they said that
those were obviously the most dangerous places. And those were the ones that
were in the highest need of getting these speed cameras. And we're going to
recreate that now. So all of you who have got
the dice, in a second I'm going to ask you to roll them. I'm very aware of the fact that
there's not that much room. So can I just suggest you give
them a good shake in your hand and just drop them on the floor. But, yes, so I'm going
to ask you to do that. I want you to do it twice. And count up the
score that you get. And then we're going
to decide where we're going to put our speed cameras. So if you could all
roll them for me twice, that would be marvellous. There should be quite a few
more on that row actually. There should be more than one. Oh, you've handed them back. OK, that's fine. OK, right, so did
anybody get a 12? Anybody get an 11? Oh, I heard a twitch then. Anybody get a 10? I only got 1. OK, right, what we do is
we're going to redo this. I usually have more dice. I gave this talk last week
to a group of teenagers. And they stole my
dice because they all like to take souvenirs. So I'm now doing a probability
problem with fewer dice than I normally would have. So bear with me. And we'll just
repeat that again. So if you could all just
give them a really good roll and repeat that. Do it twice for me. And we'll see what we get. This is what happens when
you present to teenagers. They like souvenirs. OK, right, how about this time? Did anybody get any 12s? Any 11s? We've got two, OK,
right, brilliant. 3? 3? Sorry, it's the lights. OK, so I'm going to
give you a speed camera. So we got a speed camera there. You happen to be all spread
out as much as you possibly could be. Oh, thank you very much. And a speed camera over here. Can I just ask you to
pass that back behind? That'll be brilliant. Thank you. So now we've got our
speed cameras in place. We now need to see
if they've worked. So what I want you to do is
to all repeat the same thing again. Last time, I promise. If you could all just roll
your dice for me again, twice. OK, where we have speed cameras, what did you get this time? 6. 6. 8. 8. 7. 7. So we can see where
we've got our speed cameras we've seen a reduction
in the number of accidents-- [LAUGHTER] --meaning our speed cameras have worked, right? Maybe not. OK, so this is a really
nice demonstration of what we call
regression to the mean. So what do I mean by
regression to the mean? Regression to the mean is the idea that we all behave according to some sort of average. But we don't get exactly the
same value every single time. We have these random
fluctuations around it. I simulated here a
number of accidents, where I kept the average a
constant 10 all of the time. But you see that I don't always
get 10 accidents every month. Sometimes I see
higher than that. Sometimes I see lower than that. And this is what you would
expect to see just by chance. And crucially, it was
one of these random highs where I actually chose to
put in my intervention. You didn't have a
higher chance of rolling a high number that then changed
when I gave you a yellow hat. I chose to put the interventions
in those places that were randomly high
the first time around. And regression to
the mean tells me that I would have expected
them to be lower the next time around just by chance. It's regressing to the mean. So regression to
the mean tells you I've got something lower
the first time around. I expect it to be higher
the next time around. And this is exactly
what happened when speed cameras came in. The government put
them in those places with the highest
number of accidents. They saw there was a reduction. And then they said
that the speed cameras must have been working. And it took a group
of statisticians to come in and
say, actually, you need to be looking at this
over a long period of time to be able to say whether or
not the average is changing through time, whether or not
the average number of accidents is actually decreasing.
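The dice demonstration is also easy to replay in code. A minimal sketch, assuming nothing at all changes between the two rounds of rolls:

```python
# Regression to the mean: every "site" has the same underlying accident
# rate (two dice, average 7). We put cameras at the randomly-highest
# sites, roll again, and the counts there fall anyway.
import random

random.seed(42)

def accidents():
    return random.randint(1, 6) + random.randint(1, 6)  # two dice, like the demo

n_sites = 100
before = [accidents() for _ in range(n_sites)]

# "Install cameras" at the hotspots: the ten sites with the highest counts.
hotspots = sorted(range(n_sites), key=lambda i: before[i], reverse=True)[:10]

after = [accidents() for _ in range(n_sites)]  # nothing has actually changed

avg_before = sum(before[i] for i in hotspots) / len(hotspots)
avg_after = sum(after[i] for i in hotspots) / len(hotspots)
print(f"Hotspot average before cameras: {avg_before:.1f}")  # well above 7
print(f"Hotspot average after cameras:  {avg_after:.1f}")   # back near 7
# The drop is pure regression to the mean, not the cameras working.
```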
So we see regression to the mean all the time in sports. If you think about your
favourite sports teams, they'll go on random
winning streaks. And they'll go on
random losing streaks. When they go on losing
streaks, they sometimes sack their manager. And a new manager comes in. And they say, oh, look,
they're winning again, must be the manager. A lot of the time, it can
be explained by regression to the mean. That losing streak
is just a random low. And when they bring
the new manager in, they just go back to their
average performance level. And some research has shown
that actually those teams that stick with their managers
see the bounce back in form much quicker
than those that actually bring in new managers. There's something called the
Sports Illustrated curse, that says when you appear on the
cover of Sports Illustrated, it's a curse. You then go on to
perform really badly. But it can be explained
by regression to the mean. If you think about
what does it take to appear on the cover
of Sports Illustrated, you have to be at the very
top of your game, which is going to be a combination
of your natural ability. But you're probably also
going to be riding one of these random highs as well. And this curse isn't
necessarily a curse. It's just you then-- that random high
coming to an end. And you're going back
to your average ability. So I argued that when we looked
at this story here, all of this could be explained by
regression to the mean. We would expect the number
of air crashes and fatalities to remain low. But we are going to see
these fluctuations around it. Some years, we're just naturally
going to see slightly more. And some years we're just
going to see slightly less. And the fact that there
have been none in 2017, and then one in 2018,
didn't necessarily mean that everything
had gone wrong. And we all of a sudden
needed to be having these big investigations. There were also stories about
this time last year looking at the London murder rate
now beating New York, as stabbings surged. And there was a question
as to whether or not London was now a more
dangerous city than New York. And BBC's Reality Check
actually looked into this. This is a really good resource, the BBC's Reality Check. They get statisticians to look at these sorts of claims. And the claim was that London had overtaken New York for murders. And it was now more dangerous. And they found that
a selective use of statistics over a
short period of time appeared to bear it out. But the reality was that
New York still appeared to be more violent than London. If you looked at it over
a longer period of time, then New York did appear
to be still more dangerous than London. So while we're on the
topic of airplanes, I'm a vice president of the Royal Statistical Society. And alongside some of the work that I do with the RSS, I have a bit of a hobby. And I like to give
Ryanair a headache. [LAUGHTER] So it first started
out when I was approached by BBC's Watchdog. So Ryanair had changed their
seating allocation algorithm. It used to be that
if you'd booked as a group, when you
checked in you would all get to sit together. And then they
changed it and they said if you didn't book seats
together and pay for them, you would be randomly
scattered throughout the plane. And loads of people started
complaining to Watchdog, saying that they
thought there were too many middle seats
being given out. So the window seat
is quite desirable because you get the nice view. The aisle seat, you get a
little bit of extra legroom. The middle seat is seen as
the least desirable seat. But everyone seemed
to be getting them. And I thought there might be something going on there. So they decided to send
four of their researchers on four flights. And on every single
one of the flights, they were all
allocated middle seats. And they got in touch
with me and said, hey, what's the chances of that
happening, if the seating allocation is truly random? So I did some stats for them. And then I went on TV and
I told them what I found. So it wasn't actually a
very complicated calculation that we did. They sent me the information
available at the time of check-in for each
of the four flights. So this is an example
of one of them. So when they checked
into their flights there were 23 window seats, 15 middle seats, and 27 aisle seats available,
so a total of 65 seats. And using this, I can then
work out the probability that they're all
given middle seats. So the probability that the
first person gets a middle seat is 15 over 65 because
there's 15 middle seats and 65 seats in total. The probability then
that the next person is given a middle
seat is 14 over 64 because there's
now 14 middle seats available from 64
seats in total. And I carry on. So the probability of the
third person is 13 over 63 and the fourth
person is 12 over 62. And if I multiply all
of these together, that gives me the probability
of all four being middle seats. And it's about 0.2%, which is one in 500.
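That product is quick to verify. A sketch using the seat counts quoted above for the first flight (the other three flights' counts aren't given here):

```python
# Probability that all four researchers get middle seats on one flight,
# if seats really were allocated at random: 15 middle seats, 65 in total.
middle, total = 15, 65

p = 1.0
for person in range(4):
    p *= (middle - person) / (total - person)  # 15/65 * 14/64 * 13/63 * 12/62

print(f"P(all four in middle seats) = {p:.4f}")  # about 0.002
print(f"That's roughly 1 in {1 / p:.0f}")        # roughly 1 in 500
# Repeating this for the other three flights and multiplying the four
# probabilities together is what gives the roughly 1-in-540-million figure.
```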
And one in 500 isn't actually that small a probability if you think about how many flights Ryanair have every day. It's not too surprising. But this is just one flight. As I said, they did it
on another three flights. And they all got middle
seats on those three as well. So I did the same calculations
for the other three flights. And then I combined
it all together. And I found out the probability of all four researchers getting middle seats on all four flights was around 1 in 540 million. So you were more than
10 times more likely to win the national lottery
than you were for this scenario to happen. But, you know,
tiny probabilities don't necessarily
mean rare events. So I went and had a look at
Ryanair's facts and figures. And they say that
they only carry 130 million annual customers. So I was pretty
convinced that not only was this a
small probability, it was a rare event. And I was suspicious
as to whether or not there was something going
on with their algorithm. Now, they said, you know,
we've got our stats. You've got yours. My stats are right,
thank you very much. But anyway-- And it all kind of-- it kind of
died a little bit of a death. We got some media attention. It was in the newspapers,
but then not really very much happened. Until a couple of months
later, when 12 women all went on holiday together. And they all got middle seats. And they called the Telegraph. And the Telegraph then
called me and said, hey, we heard you did
some work on this. What are the chances? So I went through
everything that I'd done on the Watchdog story. And they got in
touch with Ryanair. And at that point,
Ryanair admitted that they'd been lying. They admitted that they actually
kept window and aisle seats. They held them back when
randomly allocating the seats because those were
the ones that people were most likely to pay for. So this random allocation
wasn't a random allocation throughout the whole plane. It was a random seat from within the middle seats that you were actually getting. And so, yeah, I was
really happy that I managed to get Ryanair to
admit to their customers that they'd lied. And I also managed to
upset them in the process because they didn't think
that the negative media attention, including the BBC
investigation, was warranted. So let's all feel
sorry for Ryanair. However, they are the
gift that keeps on giving. So last April, they released
the results of a customer satisfaction survey. They said that 92% of their customers were satisfied with
their flight experience. I thought, really? I'd been on a
Ryanair flight, 92%? So I decided to take
a look at the survey. Now, bear in mind this
was an opt-in survey. So my argument was you're only
going to opt into a survey if you're really satisfied
with your experience and you want them
to know about it or you're dissatisfied
with your experience and you want them
to know about it. This was a survey that they
asked people to fill in. So this was the 92% here. But if we look at the
options that people got when filling
out this survey, they went from excellent to OK. So if you were dissatisfied
with your Ryanair experience, there was no way of expressing
that dissatisfaction at all. And I argued that you just wouldn't fill out the survey. You'd just exit, switch it
off, and you'd disappear. So basically what you were asking was a group of satisfied Ryanair customers just how satisfied they were with their Ryanair experience. And then we were
really surprised that once you combined
three of the columns, you got a high percentage. So I went into the Times
and I said as much. And they had a really
grown up response, where they said 95%
of Ryanair customers haven't heard of the
Royal Statistical Society. [LAUGHTER] 97% don't care what they say. And 100% said it sounds like
their people need to book a low-fare Ryanair holiday. I mean, the stats
in that are wrong because if 100% say we need to
book a low-fare holiday then 100% of them have heard of us. So the stats are wrong
to start off with. But one of the members of
the Royal Statistical Society noted that there were 130
million annual Ryanair customers. And if 5% of them had heard of
the Royal Statistical Society, that meant that it was 6 and 1/2
million Ryanair customers who had heard of the Royal Statistical Society. And to be honest, we'd
probably take that. But there we go. So, yeah, I like to-- I like to give Ryanair
a headache as a hobby. It's quite fun. Interestingly,
though, my boyfriend is currently at the end of
his training to be a pilot. And Ryanair is one
of the big options that he might want to work for. So that's irony right there. So as I said, I'm a member
of Royal Statistical Society. And one of the big projects that
we've got for this coming year is actually trying to improve
data ethics in advertising. So why is this such an issue? I'm going to play
you a little advert. And we're going to
talk about adverts in a little bit more detail. [VIDEO PLAYBACK] [MUSIC PLAYING] - Pearl Drops cleans. Pearl Drops whitens. Pearl Drops protects. Pearl Drops shines. Pearl Drops 4D Whitening System,
not only whitens, but cleans, shines, and protects, too. Ultimate whitening,
up to four shades whiter in just three weeks. Pearl Drops Tooth Polish,
go beyond whitening. [END PLAYBACK] So we're used to seeing
these all the time. And we're to seeing these
things all of the time, at the bottom of them. So there's some survey
that's been done. And so many people agree. Now, there's lots of
things wrong with this. First of all, agree
with what exactly? I mean, there are a lot
of claims in the advert. It cleans, it
whitens, it brightens. Which of these exactly
are they agreeing with? But a lot of the time when people
hear I'm a statistician, it's like, oh, adverts, can't
trust anything, can you? You can't trust any
of the stats in there. They use such small sample
sizes that none of the results are reliable. And I want to talk about this
in a little bit more detail. So when you see this
52% of 52 people agreed, what should you be thinking
about when you see this? So I want to talk a little
bit about uncertainty. What do I mean by uncertainty? So if I was to take 10 people
and line them up here and ask them to flip a coin
10 times, I know that there is a 50/50 chance
of getting a head or a tail. But if they all
flipped it 10 times, they wouldn't all get
five heads and five tails. Some people might get six heads. Some people four heads, some
people might get 10 heads. That's what I mean by uncertainty.
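You can watch that noise appear in a couple of lines of simulation. A sketch, with a fixed seed only so the run is reproducible:

```python
# Ten people each flip a fair coin ten times. The underlying probability
# is exactly 50/50, but nobody is guaranteed exactly five heads.
import random

random.seed(7)

for person in range(1, 11):
    heads = sum(random.randint(0, 1) for _ in range(10))
    print(f"Person {person:2d}: {heads} heads out of 10")
```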
In statistics, we talk
about the difference between probability theory
and statistical inference. So in probability
theory, we know the underlying probability. And yet we see noisy data
when we do experiments. So that coin example, I know
the underlying probability is 50/50. But I see noisy data when I do
different experiments with it. A lot of the time in statistics
what we're actually trying to do is to go the other way. And we're trying to
use samples of data that we know are noisy
and subject to uncertainty and use that to tell
me something about what the underlying probability is. So there I had 52%
of 52 people agreed. If I'd taken a
different 52 people, I wouldn't have seen
exactly 52% agree. And I'd have seen a
slightly different number. And a different 52 people
would have given me a slightly different
number again. And ultimately, what
I'm trying to do in statistics is
to take that sample and that piece of
data that I know is noisy and subject
to uncertainty, and use that to
tell me something about what the underlying
probability is on a population level. So what I'm trying
to do is I'm really trying to create a hypothesis
test to see whether or not that percentage that I'm seeing
is statistically significant. So what do I mean in
hypothesis testing, how do I carry that out? What I do is I formulate what
we call a null hypothesis. And a null hypothesis would
be that the observations are a result of pure chance. So my underlying probability
of people agreeing is actually just 50/50. It's all just down to chance. And what I then say is let's
assume that that's true. Let's assume a null
hypothesis is true. Let's assume that the data
I'm seeing is just random. And it's just by chance. What then is the probability
of me seeing the data that I've seen or seeing something
at least as extreme as what I've seen? So let's break that down. I understand that's quite a
lot to get your head around. So let's break that down
for this particular example. So my null hypothesis in this
example would actually be 50%. I'm assuming that these people
have got a survey that says, do you agree that this toothpaste
whitens your teeth, yes or no? Pure randomness would
be if they just randomly ticked yes or no. And so across my
sample, I would expect it to be about half
yeses and half nos. That would be what would be
my pure randomness, just by chance. So 50% would be what would
correspond to pure chance. So then anything going in the
direction of agreeing or going in the direction of
disagreeing, that's actually giving me information. That's telling me
some of the people have got an opinion as to
whether or not they disagree or agree with that statement. And I've got my 52% here. And I know that that is
subject to uncertainty. So as I said, I know that if
I took a different 52 people, I'd get something slightly
different from this. So what I can do is I can put
a confidence interval on this. And this confidence interval
is related to the sample size. And it tells me, OK,
52%, that's the best estimate as to what the true
underlying probability might be. But what could it be? What values could
it possibly take? And a confidence
interval then gives me a range of values that
might be plausible. And as I increase
my sample size, I actually decrease the
amount of uncertainty. And I decrease-- I make my
confidence interval smaller. And what we're looking
for in hypothesis testing for a statistically
significant result is we want that confidence
interval to not cross the null hypothesis. So my null hypothesis
here was 50%. I want my confidence interval
to not cross that 50%. If it doesn't cross that 50%, I say it's a statistically
significant result. If it does cross
that, then I say I haven't got enough evidence to rule out pure chance. It could just be pure chance. So that confidence
interval really matters when I've got something
close to the null hypothesis because that 52% is
really close to that 50%. I'm going to need
quite a big sample size to be able to make that
confidence interval small enough so that it
doesn't cross that 50%. If, on the other
hand though, I had got a result that was a lot
further away from that 50%, I wouldn't necessarily need
to have as big a sample size because it's not as close to
that null hypothesis of 50%. It doesn't matter if the
confidence interval is wider. So yes, when you
see these surveys that are being done
on small samples, it's not always a problem. It depends on how big
your result actually is. It's not just the
sample size in itself, but it's also what we say, you
know, the effect size as well. So just back to our example, we
have 52% of 52 people agreed. A 95% confidence interval on
this based on these 52 people is 38 to 66. So 95% means if I was to
repeat this a hundred times, I would expect 95% of
them to be between 38 and 66. And it crosses that 50% mark. So here, this is no different
from just pure randomness. This is no different from
people just flipping a coin, saying yes or no, I agree
with that statement. If I was to take another
one that said 74% of 54 men agreed with some
statement after 28 days, the confidence interval
on this is 60 to 85. So it's a similar
kind of sample size. We've got similar sized
confidence interval. But because our treatment effect
was 74% to start off with, and that's a lot further
away from the 50, I have enough evidence here to
say that there is a difference. And actually people
do have a preference.
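Both of those confidence intervals are simple to reproduce. A sketch using the usual normal approximation for a proportion (the exact method behind the quoted figures isn't stated, so the numbers only roughly match; the confidence_interval helper is just defined here for illustration):

```python
# 95% confidence intervals for the two survey results quoted above,
# using the normal approximation for a proportion.
import math

def confidence_interval(p_hat, n, z=1.96):
    """Approximate 95% confidence interval for a proportion."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * se, p_hat + z * se

for p_hat, n, label in [(0.52, 52, "52% of 52 people"), (0.74, 54, "74% of 54 men")]:
    lo, hi = confidence_interval(p_hat, n)
    verdict = ("crosses 50%: could be pure chance" if lo <= 0.50 <= hi
               else "clear of 50%: statistically significant")
    print(f"{label}: {100 * lo:.0f}% to {100 * hi:.0f}%  ({verdict})")
```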
And I'm just going to finish off now
that just say-- because I don't think uncertainty
necessarily has to be very difficult to communicate. I think-- [LAUGHTER] When we look at
the weather and we look at when they tell you
your probability of rain, I mean, these numbers
are ridiculous. So what is it? I mean, at 3 o'clock we
have a 10% chance of rain. That goes up to 13% at 4
o'clock and 16% at 5 o'clock. What am I supposed to do
with this information? I don't know what the
uncertainty is on that. And they're really
precise point estimates. But it would be super-easy to
communicate the uncertainty using some sort of graphic. Now, graphics have the
ability to do great good. They also do have the
ability to do great evil. And I just want to finish off
with a couple of my favourite bad graphics because it is
something you really need to watch out for when you're
looking at stats in the media. So there's this one, which
is one of my favourites. This is the presidential run. So a pie chart should
sum up to 100%. [LAUGHTER] This doesn't. And they've obviously here asked
would you back this person, yes or no? And then thought
that a pie chart was the most appropriate way to
communicate that information. This one, I've got no
idea what they asked. Half of Americans have
tried marijuana today? I'm not-- I don't know
if I believe that. But if 43% of them have tried
it in the last year, which includes today, how
have 51% tried it? The numbers are all wrong. I can't figure out
what's going on. They have however, though,
included uncertainty. We know it's plus or minus 4%. But I've got no-- I've got no idea. This one from the Office
of National Statistics is a very sneaky one. And one that shows
you always need to look at the scale of the
plot because actually this is an increase in GDP. And it went from 0.6% to 0.7%, their predicted growth upgrade. And it's a 0.1 percentage point increase. And that looks a lot
bigger on that plot. And if we look at the
axis along the bottom-- and look at the scale of that. If you zoomed out onto
that plot and looked at the whole percentage
line, it would be just a minuscule difference. So look at the scale. And my last one is my
favourite, from Ben and Jerry's. I don't know what world we live
in where 62 is smaller than 61. But I don't want to
live in that world. Here is an example where
Ben and Jerry's have got a story that they want to tell. And the numbers didn't
quite agree with that story. So they decided to
produce a graphic that told that story anyway. And hoped we wouldn't look at
the numbers in enough detail. Look at the size of that
24% compared to that 21%. They've got a clear story
that they wanted to tell. And if you were just
flicking through a magazine, you might not necessarily
look at the numbers in as much detail. So yes, so very, very
naughty from Ben and Jerry's. So if you look at
stats in the media, I would encourage you to think: relative versus absolute risks, correlation versus causation. Could this have happened just
by chance, this regression to the mean? Yeah, eat bacon. But don't eat cheese
before you go to bed. Thank you very much. [APPLAUSE]