STEVE MOULD: This is
Benford's Law. And it's about numbers, but it's
about the leading digit. For example, you could look at
the populations of all the countries in the world and
look at the leading digits of all those. So for example, if it was 1,269,
then the leading digit in that case is the one. Benford's Law works on a
distribution of numbers if that distribution spans quite
a few orders of magnitude. And the brilliant thing about
populations of countries is that it actually goes from
tens up to billions. If you were to think about
that, OK, what are the distribution of leading
digits. So some of the populations will
start with the one, some will start with two, three,
four, five, six, seven, eight, or nine. And so there are nine possible
leading digits. And you might imagine that each
one of those possible leading digits are equally
likely to appear. So that's one in nine-- 11%. And if I was to plot that on a
graph, you might expect that to fluctuate around 11%. So it's going to go like that. So what actually happens is
that a third of the time, that's up here. A third of the time the
number you choose will start with a one. And it will hardly ever
start with a nine. So nine is down here-- tiny number. And then you get this
brilliant curve that goes up like that. Isn't that crazy? BRADY HARAN: I know you talk
about this sometimes in talks and things you do. What's the reaction to
that normally when you tell people this? STEVE MOULD: The reaction? The noise is sort
of like this-- ohh. And there's a certain
amount of disbelief sometimes as well. And the way we do it actually
in the show is that we get people to tweet numbers to us. So we're collecting numbers, and
I try to give them ideas. So maybe like, take the distance
from the venue to where they live and convert that
into some strange units. Or something like that. The interesting thing is, like I
was saying, it works so long as the distribution you're
choosing from spans loads of orders of magnitude. But if you're picking numbers
from lots of different distributions, the individual
distributions don't have to span lots of orders
of magnitude. The meta-distribution of
individual things picked from different distributions follows
Benford's Law anyway. So it works brilliantly well. BRADY HARAN: What clump
of numbers will this not work for? STEVE MOULD: Human
height in meters. So humans are between one
meter and three meters. So it doesn't work for that. You get a massive load
around there. And no one's nine meters tall. Anything that has that short
distribution, it doesn't work for. But it does work for several
distributions put together that don't necessarily
individually follow the rule. So I did it for populations. I did it for areas of countries in kilometers squared. If you take one number and
convert it to loads of different units, that
will tend to follow Benford's Law as well. You can do it for the
Financial Times. Look at all the numbers
on the front cover of the Financial Times. They will tend to follow
Benford's Law as well. BRADY HARAN: Just a quick
interjection-- you can also apply this to the
number of times you watch Numberphile videos or leave
comments underneath. More information at the
end of the video. STEVE MOULD: So the explanation
is to do with scale invariance, which
I'm just getting my head around now. But there are a couple
of intuitive ways of understanding it. One of them is to use the
idea of a raffle. To begin with, it's a
very small raffle. So there are only two tickets
in this raffle. What are the chances of the
winning ticket in this raffle having a leading digit of one? Well, that's this one. So it's one in two. It's 50%. But then if you increase the
size of the raffle, so there are now three tickets in the
raffle, the chance now are one in three or about 33%. If you add a fourth ticket, then
the probability of the leading digit of the winning
ticket being a one is now 25%, and then 20%, and so on and so
on until you have a raffle with nine tickets in it. And then the probability of the
winning ticket having a leading digit of one
is one in nine. It's 11%, which was the
intuitive thing that you might think. But then you add your
tenth ticket. And now there are two tickets
that start with a one. So now the probability
is 2 in 10 or 1 in 5. So it would go back up to 20%. The probability will go up, and
up, and up as you add more tickets that start with a one. And once you have a raffle with
19 tickets in it, you're up to something like 58%. And then you add the
20th ticket. And so the probability
goes down again. So the probability of the
winning ticket having a leading digit of one will go
down, and down, and down through the 20s. It will go down through the
40s, down through the 50s, 60s, 70s, 80s, 90s, until you
add the hundredth ticket. And then the probability will
start to go up again. And then the probability will go
up, and up, and up, all the way through the 100s. And then you get to the 200s,
and it goes down, and down, and down through all the 200s,
300s, 400s, 500s, 600s, 700s, 800s, 900s. And you'll be back to
11% again then. Then you add the thousandth
ticket. And the probability will
start to go up again. So the probability goes up,
and up, and up through the thousands and then down through
the 2000s, 3000s, blah, blah, blah. And then you get to 10,000
and it goes up. And so basically the probability
of the winning ticket having any digit of one
fluctuates as the size of the raffle increases. And so this is a log scale of
the raffle increasing in size. So you might have a 10, 100,
1,000, 10,000, and so on. And then this is the probability
here of having a leading digit of one. It goes like that. What Frank Benford realized was
that if you pick a number from a distribution that spans
loads of orders of magnitude, or if you pick a number from
the world and you don't necessarily know what the
distribution is in advance, then it's like picking a ticket
from a raffle when you don't know the size
of the raffle. So you have to take the average
of this wiggly line, which is what he did. So that's the average there. And it's around 30%. There's a formula for it, which
is the probability of picking a number with a
particular leading digit of d is equal to log to base 10
of 1 plus 1/d, like that. And so that's how you do it. And if you plug one into
there, then it's log base 10 of two. And it ends up being
about 30%. The beauty is that you can
do it in any base. So this doesn't have
to be base 10. It could be base five, base 16,
whatever you want to do. You can apply Benford's Law
to different bases. This is a formula that a
forensic accountant would use as a tax formula of something
like that. If you're making up numbers in
your accounts and the numbers you make up don't follow
Benford's Law, then that's a clue that you might
be cheating. So this is a formula you need to
remember if you're going to cheat on your tax return. BRADY HARAN: A lot of things
that mathematically inclined people like yourself tell me
when I hear about them seem counter-intuitive. And then you cleverly explain
why it works the way it works. This is one of the few things
that when I've heard about it, this just seems logical to me. When someone says one will come
up more often, to me that just seems like, of course
that would happen. STEVE MOULD: Yes. Funny isn't it? Some people are like you. I would say you're in the
minority of people that go, well, yeah. And I wonder if there is another
intuitive way of looking at it that you've tapped
into, which is that if you imagine something like
the NASDAQ index or something like that-- and I don't know what the NASDAQ
index is size-wise-- but imagine that the NASDAQ
index is at 1,000. To change that to 2,000, you'd
have to double it. So the NASDAQ index would have
to increase by 100% to get from something that starts with
a one to something that starts with a two. So that's quite a big change. But if the NASDAQ index was
on 9,000 and you wanted to increase it to 10,000, then
that's an 11% increase. So it's hardly anything. So basically, you don't really
hang around the nines. As things are growing and
shrinking, you don't hang around, whereas you do
hang around the ones. And maybe that's intuitive
to you. So you're like, yeah
obviously. BRADY HARAN: If you'd like to
see even more about Benford's Law, we've done a bit of a
statistical analysis to find out whether or not your viewing
habits and the number of times you comment on
Numberphile videos is following the Benford curve. The link is below this video
or here on the screen. So why don't you check it out?
Actually, the beginning number of all sorts of things tend to favor one, and the distribution of starting numbers tends to mimic logarithms (one is most frequent, nine is least frequent).