- Thank you to ExpressVPN
for sponsoring this episode. Hi there, welcome to
Up and Atom, I'm Jade. Today's video is about one
of the deepest theorems in mathematical logic and a fundamental limit
on human understanding. We'll explore logical paradoxes, the nature of information, and the possibility of
an algorithm for truth. My hope is that by the end of the video, you'll have a new appreciation for the beauty and power of mathematics. So are you ready? Our journey begins with an odd question, what's the biggest
number you can think of? Maybe it's a googol, a
1 followed by 100 zeros. To give you an idea of
how big this number is, there are between 10 to the power of 78 and 10 to the power of 82 atoms
in the observable universe. It's a pretty big number, but what about a googol plus 1? Whatever number you think of, can't you just add one more, or multiply it by itself, or take it to the power of itself? Numbers go on forever, so is it even possible to
think of a biggest number? Also, thoughts are vague. Maybe you're thinking of a number, but you can't quite say what it is; it's more of an idea. So let's make the question
a bit more concrete, let's instead ask, what's the biggest
number you can describe? Let's think about this question. If there is a biggest
number you can describe, logically, that means there are some numbers
you can't describe, right? They're bigger than the biggest number you can describe. But why shouldn't you be able to describe any number you like? Well, while there are
infinitely many natural numbers, our lives aren't infinitely long. So even if I started counting
a new number every second, from the moment I was born, and lived to be 100 years old, I could only count to 3,155,760,000. But this is a pretty stupid
way to describe that number. In fact, just saying the number is already a description of it 'cause you know what
number I'm talking about. I could also describe it as the number of seconds in 100 years, which took even less time. So here we've just described a number in three different ways. One took a whole lifetime and
one took only eight words. But are there some
numbers we can't describe? Well, what about a number
that was a googol digits long? If we tried to write out this number, it would fill the space out to the most distant visible star. It would be impossible for a human to read out all of these
digits in their lifetime. But maybe there's a clever, or a shorter way we can describe it. But imagine the number is so irregular that there simply isn't
another way to describe it, other than just reading
out its digits one by one. There is a realm of natural numbers that simply can't be
described by a human being. As you can't describe all numbers, it logically follows that there must be a biggest number you can describe. So the next number after that is the first number that
cannot be described by a human. But wait, didn't we just describe it? This is Berry's paradox, named after the Oxford
librarian, G.G. Berry. It can be restated as the
smallest positive integer that cannot be described in
fewer than 15 English words. The paradox being that
we've just described it in 14 English words. At the moment, this might all
seem like trivial wordplay or language trickery, but we'll see that it reveals
one of the deepest theorems in mathematical logic. How? Well, the story starts
in 1942, Cleveland, Ohio with a 16-year-old boy
named Ray Solomonoff. Solomonoff was searching
for a general method to solve mathematical problems, an algorithm for truth. Now, because the rest of
the video revolves heavily around this idea, I want you to really understand it, both to do justice to Solomonoff's genius and so you can appreciate the brilliance involved. So let me show you what I
mean by algorithm for truth. Here's me playing rock paper
scissors against an AI. Anyone who's played knows
that the best way to win is by guessing what the
opponent will choose, and then picking the object that beats it. The AI is programmed to
detect my choosing patterns, but I'm trying to be random. The more we play, the better the AI models what I'll choose. It's aware of a pattern in my behavior that I'm not.
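To make "detecting my choosing patterns" concrete, here's a minimal sketch in Python of one way such a bot could work. To be clear, this is my toy example, not the actual AI from the video: it simply counts which move tends to follow my previous move, then plays the counter to the most likely one.

```python
import random

# Which move beats which.
BEATS = {"rock": "paper", "paper": "scissors", "scissors": "rock"}

class MarkovPredictor:
    """Toy pattern detector: learns which move tends to follow
    the player's previous move, then plays the counter-move."""

    def __init__(self):
        # counts[a][b] = how often move b has followed move a so far
        self.counts = {a: {b: 0 for b in BEATS} for a in BEATS}
        self.last = None

    def counter_move(self):
        # With no history yet, play randomly; otherwise beat the move
        # that most often followed the player's last move.
        if self.last is None or not any(self.counts[self.last].values()):
            return random.choice(list(BEATS))
        likely = max(self.counts[self.last], key=self.counts[self.last].get)
        return BEATS[likely]

    def observe(self, move):
        # Update the habit counts after seeing the player's actual move.
        if self.last is not None:
            self.counts[self.last][move] += 1
        self.last = move
```

Even a predictor this crude tends to pull ahead over enough rounds, because human "randomness" leaks habits.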
You might be thinking, Jade, maybe you just suck at pattern detection. I can tell the difference between randomness and a pattern. Oh yeah? Here's a sequence of numbers. Can you identify the
pattern or rule they follow? Pause the video if you'd like to try. They're the digits 100 to 1000 of pi. I took out the first 99, so it wouldn't be obvious to
the mathematicians out there. They certainly don't look like
they follow a simple rule. In fact, to us, they look extremely
irregular, even random, yet they can be generated by a very basic formula.
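To show what "a very basic formula" can look like, here's a sketch in Python of one well-known method, Gibbons' unbounded spigot algorithm. The video doesn't say which formula it has in mind; this is just one that works, using nothing but integer arithmetic.

```python
from itertools import islice

def pi_digits():
    # Gibbons' unbounded spigot algorithm: yields the decimal
    # digits of pi one at a time, forever.
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4 * q + r - t < n * t:
            yield n
            q, r, n = 10 * q, 10 * (r - n * t), (10 * (3 * q + r)) // t - 10 * n
        else:
            q, r, t, k, n, l = (q * k, (2 * q + r) * l, t * l, k + 1,
                                (q * (7 * k + 2) + r * l) // (t * l), l + 2)

# Digits 100 through 1000 of pi (skipping the first 99), as in the video.
print("".join(str(d) for d in islice(pi_digits(), 99, 1000)))
```

A dozen lines of integer arithmetic, and out pour as many of those "random-looking" digits as you have patience for.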
What other information is out there that we think is incredibly complex, but in fact, follows a simple rule? Maybe predicting the weather is a piece of cake, and we just haven't figured it out. Maybe the theory of everything, the ultimate end goal of physics, is staring us right in the face, and we just can't see it. Solomonoff wanted a step-by-step
set of instructions, where we could plug in any set of data and it'd spit out the general
rule the data follows, or tell us if there isn't one and we should stop
wasting our time looking. A clear step-by-step set of instructions is called an algorithm. It sounds crazy, but so crazy it just might work. The first problem was, what's the best way to figure stuff out? Scientific phenomena all seem so different. How could one algorithm
possibly solve all problems? Well, although the ways
we understand the world are very different, there are some things they have in common. Usually we observe the world, record data, and try to figure out a
law that describes it. This process is known as
the scientific method. Sometimes there are many
hypothesis to explain the data, but it's a general rule that
the simpler the explanation, the more likely it is to be the right one. This is summed up in Occam's Razor, the simplest explanation
is usually the best one. It's called Occam's Razor because it shaves away
unnecessary assumptions. Solomonoff created a general
theory of inductive reasoning based on these ideas. There was a problem though. It's not always obvious
what the most elegant or simple solution is, these are highly subjective words. Solomonoff needed a way
to objectively measure the complexity of different hypotheses. It's here in our story that we're joined by two other characters, Gregory Chaitin and Andrey Kolmogorov. One American, one Russian, both extraordinary mathematicians. So the problem was, how do we compare the
complexity of different things? What's more complex, a Mozart
piece or a Hemingway novel, a sea slug, or Fermat's Last theorem? It may not seem like it, but there is something that all of these things have in common. In fact, something that
everything in the universe has in common, it's all made up of information. Information can be thought of as the resolution of uncertainty. The most fundamental unit
of information is the bit, usually represented by a 1 or a 0. How in the world can all of these things be made up of 1s and 0s? Well, think about this, everything your computer or phone does is represented and
understood in terms of bits. Every single program run on a computer is, at the most fundamental level, just a series of 1s and 0s. In principle, all of your sensory input and the laws of the universe can be represented by a
long sequence of 1s and 0s. These are called strings.
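To make that concrete, here's a tiny Python sketch (my example, not one from the video) that turns ordinary text into the string of 1s and 0s a computer actually stores:

```python
# Encode text as the bit string a computer stores: under UTF-8,
# each character becomes one or more bytes of 8 bits each.
text = "Up and Atom"
bits = "".join(f"{byte:08b}" for byte in text.encode("utf-8"))
print(bits)  # 0101010101110000... ('U' is 01010101, 'p' is 01110000, ...)
```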
So now, rather than dealing with vastly different entities, we can convert everything into strings. When things are in the same language, it becomes possible to
measure and compare them. Now we just need a way of deciding which sequence of 1s and 0s is simpler and which is more complex. So how do we do that? Well, let me ask you, of these two strings, which
do you think is simpler and which do you think is more complex? A good place to start might be asking, which is easier to describe? The first string is just the
digits 10 repeated 25 times. We can program a computer to print it by giving it the instructions, or algorithm: print 10 times 25, stop. Of course, the exact phrasing will depend on the programming language, but you get the idea.
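In Python, for instance, that instruction really is about this short. A sketch:

```python
# A 50-character string described by a program far shorter than the string itself.
print("10" * 25)  # prints 101010...10, the digits 10 repeated 25 times
```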
So some strings can be described in ways shorter than writing out the whole string, just like some numbers can be described in ways shorter than saying every single digit. But how about this string? There doesn't seem to be an obvious way to tell the computer to print it other than literally saying, print, then naming every
single digit, stop. Now let's remember what
these 1s and 0s represent. They're the basic unit of information, and information is the
resolution of uncertainty. So the more information
needed to describe something, the more uncertainty it
had to begin with, right? The fewer bits it needs, the less uncertainty it had to begin with. I think it's pretty intuitive to say that the less uncertainty something has, the simpler it is, don't you? This is what our team thought too, and they defined the measure
of complexity of an object to be the number of bits
needed to describe it. In other words, the longer the description needed, the more complex the object. Compacting strings into shorter strings is called data compression.
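You can watch this idea in action with an off-the-shelf compressor. A quick Python sketch using the standard library's zlib, a practical compressor that's nowhere near the theoretical shortest description:

```python
import zlib, os

patterned = b"10" * 25       # 50 bytes with an obvious pattern
random_ish = os.urandom(50)  # 50 random bytes

print(len(zlib.compress(patterned)))   # far fewer than 50 bytes
print(len(zlib.compress(random_ish)))  # around 50 or more: no pattern to exploit
```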
But we don't want just any shorter string, we want the shortest string. The expression isn't, any simpler explanation is usually the best one. It's, the simplest explanation is usually the best one. The shortest possible string
used to describe an object is called the Kolmogorov
complexity of that object.
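In symbols, a standard way of writing this (it isn't spelled out in the video) is: for a fixed universal computer $U$,

$$K_U(x) = \min\{\, |p| : U(p) = x \,\}$$

that is, the Kolmogorov complexity of a string $x$ is the length in bits of the shortest program $p$ that makes $U$ print $x$.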
So here we see that Solomonoff's goal of an algorithm for truth is the same thing as programming a computer to tell us the Kolmogorov complexity of any string. Before we go on, I just want to pause for a moment to appreciate the enormity of what's been done here. We've basically boiled all
of the scientific method down to the art of data compression. We've linked philosophy, mathematics, computation,
information, complexity, randomness, reasoning, and physics, all in one theory, incredible. The idea of an algorithm for truth suddenly doesn't seem so crazy, but finding the shortest
string is not an obvious task. Just like with numbers, there are many ways to describe a string. Is print 10 times 25, stop really the most compact way
to describe this string? If it is, how do we know? And if it's not, how
do we find out what is? Let's consider a program,
FindShortestString. We give the program a set of data, encoded as a string of course, and it carries out the following steps. It goes through every possible string, starting from the shortest, 0, then 1, then 00, 01, and so on. It then interprets each
string as a program in some programming language. Remember that a program is
just a sequence of 1s and 0s, so some of these strings
will inevitably be programs. If the string cannot be
interpreted as a program, FindShortestString moves on. If it can be interpreted as a program, FindShortestString runs it. It then returns the output
of the program as a string. Next, it compares this output against the original data
string we used as input. If it matches, FindShortestString stops; if it doesn't match, it moves on to the next string in the list.
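Here's that procedure as a Python sketch. Fair warning on the hedges: run_as_program is a hypothetical helper standing in for "interpret this bit string as a program and run it", and the sketch blithely assumes every program halts, which, as we'll see, is exactly where the trouble hides.

```python
from itertools import count, product

def all_bit_strings():
    # 0, 1, 00, 01, 10, 11, 000, ... every bit string, shortest first.
    for length in count(1):
        for bits in product("01", repeat=length):
            yield "".join(bits)

def find_shortest_string(data, run_as_program):
    # run_as_program(s) returns the program's output as a string,
    # or None if s isn't a valid program. (Hypothetical helper; it
    # also, impossibly, always comes back with an answer.)
    for candidate in all_bit_strings():
        if run_as_program(candidate) == data:
            return candidate  # the shortest program producing the data
```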
This would be a pretty good candidate for an algorithm for truth. For example, let's pretend we don't know what this sequence of numbers is, and use it as input to FindShortestString. So we want FindShortestString to return to us the rule that generates this data. So it starts going through
every possible string, interpreting them as programs
and returning the output. At some point it will
encounter this string which represents the formula for pi. FindShortestString will interpret it and run it as a program, and its output will be the
digits of pi encoded as a string. So of course, when we compare
it to our input string, it will match, and FindShortestString will terminate having found the string which
best describes the input data. Because we're starting from the shortest string and going in increasing order, it will be the shortest string that describes the input data. We'd have computed the Kolmogorov complexity; we'd have our algorithm for truth. Hallelujah. But wait, there are still
loads of unsolved problems. Scientists are still
employed all over the world. Maybe in our excitement,
we've overlooked something. All programs are made of a finite number of bits. So let's say FindShortestString
is made of 100 bits. Now consider another program, BerryParadox, made of 40 bits. We give BerryParadox a number m as input, and it must return the first string that cannot be described in fewer than m bits. So say we give it the number 150; the task of BerryParadox is to find the first string which can't be described
in fewer than 150 bits. This is pretty easy with the help of FindShortestString. BerryParadox goes through all strings in order and feeds them to FindShortestString. The first time FindShortestString returns a description 150 bits or longer, the input that was given to FindShortestString is the first string that cannot be described in fewer than 150 bits. BerryParadox returns that string.
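As a sketch, reusing the hypothetical find_shortest_string and all_bit_strings from above, with the same impossible assumption that every call comes back with an answer:

```python
def berry_paradox(m, run_as_program):
    # Return the first string that cannot be described in fewer
    # than m bits, by asking FindShortestString for each string's
    # shortest description in turn.
    for candidate in all_bit_strings():
        shortest = find_shortest_string(candidate, run_as_program)
        if len(shortest) >= m:  # no description shorter than m bits exists
            return candidate
```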
But wait, there's a slight problem, have you spotted it? We just described the first string that can't be described in fewer than 150 bits using only 140 bits: the 40 bits of BerryParadox plus the 100 bits of FindShortestString. This is a contradiction. If the program FindShortestString were computable, it would produce countless contradictions just like this one. And if we were to actually run it, it would get stuck anyway: to rule a candidate program in or out, FindShortestString has to know whether that program will ever halt, and in general it can't. It would loop forever,
never giving us an answer. It doesn't matter how
big the programs are, billions, even trillions of bits, there will always be a
string of longer length. And so there will always be
room for the contradiction. Kolmogorov complexity is uncomputable. Solomonoff's algorithm
for truth is impossible. It's not that we haven't found it yet, or that we're not clever enough; it's proven to be impossible. This may have been a blow to Solomonoff, who worked his entire life on this idea. But if old Solomonoff
could go back in time, and tell young Solomonoff not to bother with a fruitless endeavor, I don't think he would have. This search led to the field of algorithmic information theory, the study of information, computation, complexity, and randomness. Wherever there are patterns to be studied, algorithmic information
theory can be applied. What I find most fascinating are the philosophical implications. This result proves an intrinsic
limit on human knowledge. It also shows so beautifully that where philosophy
can ask the questions, mathematics can answer. I'm stoked to have been able
to share this story with you, and I hope it's given
you a new appreciation for the power of mathematics. Algorithms have come a long
way in the past few years. You'd think that with all
the advances in technology, your internet security and
privacy would be a given, but it's not. By default your internet service provider can see every single site that you visit, even in incognito mode, or when you clear your browsing history. They are legally allowed
to sell all your data to advertisers and companies for profit. But you can protect
yourself with ExpressVPN. ExpressVPN reroutes your connection through a secure encrypted server, so your ISP can't see what
websites you're visiting. It can also reroute it to a server in any country of your choice, making geo-restrictions a thing of the past. My favorite thing about using ExpressVPN is all the extra shows and
movies I can watch on Netflix that aren't available in
Australia, like "Inception". I just changed my location to Germany, and boom, I can watch "Inception", or I can change it to Hong Kong and watch my favorite
childhood anime, "Naruto". If you're based in the U.S., you can change your location to Australia and watch "Rick and Morty". Find out how you can get three months free by clicking the link in
the description box below, or go to expressvpn.com/upandatom. Thank you to ExpressVPN for
sponsoring this episode, and I will see you in the next one, bye. (upbeat music)