This is the continuation of our previous video
where we revealed that we found the asymptotically optimal algorithm for factoring. You can check it
out if you haven't yet, but hopefully this will make sense even without it. As many of you may have guessed from the date we published our previous video, it wasn't completely honest. The fact that
the video suggested that we solved one of the biggest open problems in computer science may also
have been a clue. But you know what? Although the video was heavily misleading, what we said there
was actually true! Remember, we said that we have a concrete asymptotically optimal algorithm for
factoring composite numbers. Before explaining our algorithm, we need to understand the following
pressing question - why aren't we famous? After all, it's safe to say that even determining
whether there is a polynomial time algorithm for factoring would be worth pretty much any price
a mathematician or a computer scientist could dream of! And so, the only possibility, however
weird it may sound, is that although we know the asymptotically optimal algorithm, we
don't know what its time complexity is! Okay, but even if we can't compute
the time complexity of our algorithm, why not just run it on real instances like the
one mentioned in the previous video? If we can solve the factoring problem really fast,
we can break the famous RSA crypto system, so who cares that we don't have a proof that the
algorithm is fast! Since we're not famous yet, the only possible conclusion is that our algorithm must be very slow in practice.
Remember, in computer science, we're using asymptotic complexity because it's
often hard to distinguish whether the complexity of our algorithm is, let's say this, this, or
this function. So, to make our life simple, we ignore the leading constant and the
lower order terms and just call all of these functions O(n^2). And that's reasonable
most of the time but what if the complexity was 1 million times n^2? That's also O(n^2), but the
algorithm would be extremely slow in practice. So, asymptotic complexity can sometimes be
deceptive. That normally doesn't happen, but you'll soon see that our case is, well,
very exceptional. Constant factors in our algorithm are so ridiculously massive that
it struggles to factor even the number four. So, with these two points in mind, let's now
have a look at what our algorithm actually does. You may have heard of the infinite monkey theorem
- it says that if you give a monkey a typewriter, after enough time of mashing keys randomly, it'll
eventually manage to write the complete works of Shakespeare. But you know what? With enough
time, the monkey will also write all kinds of valid Python programs (and in the remaining cases
Perl programs). And now even though I don't know if there is an efficient algorithm for factoring,
if there is one, the monkey will eventually manage to type it out in Python, given enough time. And
this is exactly the idea behind our algorithm. Instead of hiring monkeys, we're just going to
iterate over all strings in lexicographic order and try to interpret each one as a Python program. We'll give the program the number we want to factor, let's say 4, on standard input, then run the program and check whether, by chance, it printed the factorization of our number to standard output. Of course, most of the time we'll simply get a syntax error, but on the few occasions when
the program does finish, we'll check if it printed two numbers whose product is 4. Needless to say,
this also usually doesn't happen, in which case we just continue with the search. But there is
an issue with this approach - at some point, we'll encounter a program with an infinite
loop like this one. And when that happens, our search will just get stuck forever.
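To make this concrete, here's a minimal sketch of what that naive search could look like. The alphabet, the use of exec, and the convention of reading two factors back out of variables a and b are simplifications for illustration, not how our real code works:

```python
from itertools import count, product

ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789 =+-*/%()\n"

def all_strings():
    # Enumerate every string over ALPHABET, shortest first, then in lexicographic order.
    for length in count(1):
        for chars in product(ALPHABET, repeat=length):
            yield "".join(chars)

def naive_search(n):
    for source in all_strings():
        try:
            scope = {"n": n}
            exec(source, scope)               # try to interpret the string as a Python program
            a, b = scope["a"], scope["b"]     # did it leave two nontrivial factors behind?
            if a > 1 and b > 1 and a * b == n:
                return a, b
        except Exception:
            pass                              # syntax error, crash, missing output: keep searching
        # The catch: if `source` happens to contain an infinite loop,
        # exec() never returns and the whole search is stuck here forever.
```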
So, we'll have to be a little bit smarter. We'll maintain a list of simulated programs. At
the beginning, this list will be empty and we're going to proceed in iterations. In iteration k,
we first add the k-th lexicographically smallest program to this list and then simulate
one step of each program in the list. In this animation, each gear represents simulating
one step of a program, where you can think of one step as one CPU instruction - for example, adding
two numbers, assigning a variable, and so on. After we're done with all the programs in
our list, we go to the next iteration - add the next program and simulate one step
of each program in the list, and so on. Of course, sometimes it'll happen that
the simulation of a program finishes either because the program returned an
answer or, more likely, it just crashed. Then, we check whether the output of the program
is two numbers whose product is equal to our input number. If not, we just continue with the
simulation. But in the unlikely case that the finished program actually returned the correct
solution, we print it to the output and terminate the whole search procedure. Fortunately, this final check can be done very quickly. Notice, by the way, that this is the only place where it actually matters that our problem is factoring and not something else. And that's the whole algorithm. It was discovered by Leonid Levin and is known as Universal search, so let's use that name from now on. Perhaps you can already see how it's possible that we have no idea what the asymptotic
time complexity of this algorithm is and why it's so slow that it struggles to factor the number 4.
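If it helps, here is a rough Python skeleton of the procedure just described. The simulator interface (a .step() method and an .output attribute) and the program enumeration are stand-ins for illustration rather than our actual code:

```python
def universal_search(n, programs, check):
    # `programs` yields simulator objects, one per candidate program, in order.
    # Each simulator exposes:
    #   .step()  -> True while the program is still running,
    #               False once it has halted or crashed
    #   .output  -> whatever the program printed (inspect once .step() is False)
    running = []
    for new_program in programs:          # iteration k: start program number k...
        running.append(new_program)
        survivors = []
        for sim in running:               # ...then advance every started program by one step
            if sim.step():
                survivors.append(sim)
            elif check(n, sim.output):    # it halted: is its output a correct answer?
                return sim.output
        running = survivors

def check_factorization(n, output):
    # The only problem-specific part: accept two integers greater than 1 whose product is n.
    try:
        a, b = map(int, output.split())
    except (AttributeError, ValueError):
        return False
    return a > 1 and b > 1 and a * b == n
```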
But the only thing we promised was that Universal search is asymptotically optimal. In other words,
we promised that whenever there is some factoring algorithm with time complexity f(n), then
Universal search has time complexity O(f(n)). For simplicity, I just explained the version of Universal search that has a slightly weaker property: whenever your factoring algorithm has time complexity f(n), I can prove that our Universal search has time complexity O(f(n)^2). I'll now
explain this complexity and then I'm going to tell you how to improve it to O(f(n)). So, let's
say you give me some program that needs f(n) time to factor numbers with n digits. To have a concrete example in mind, we can think of the simple program that tries to divide the number by 2, 3, 4, and so on until it succeeds.
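Such a trial-division program could be as short as the sketch below; for an n-digit composite it may need on the order of 10^(n/2) divisions, so its f(n) is huge, but all that matters here is that it is correct:

```python
def trial_division(n):
    d = 2
    while n % d != 0:      # for composite n, some divisor d <= sqrt(n) exists
        d += 1
    return d, n // d

print(trial_division(4))   # (2, 2)
```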
The most important observation is that Universal search will at some point start simulating your program. For this program, that happens at, well, roughly iteration number 10^160. Let's call this number L. L is ridiculously huge, but what's absolutely essential is that it's a constant: it doesn't depend on the length of the input number that we want to factor. It only depends on the source code of your program. OK, so what
happens next? In each subsequent iteration, we simulate one step of your program, so if it takes f(n) steps to finish, we'll need to do that many additional iterations of Universal search to simulate your program to completion. Since your program factors the number correctly,
at this point Universal search will definitely terminate. Of course, it might also terminate
even earlier, because of some other factoring program that has already finished. So what's
the total number of steps that we need to make? Well, look at this triangle diagram. It shows how
many steps were simulated in total. Of course, some of the simulated programs may have finished
much earlier, but in the worst case, the time complexity of Universal search is proportional
to the number of dots in this picture. Since it took f(n) iterations to simulate our program, we
started the simulation of another f(n) programs in the meantime. So, the area of this triangle is roughly one half of (L + f(n))^2. And remember, L is just a constant, so this is simply O(f(n)^2) steps.
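Spelling out that last step as a crude bound (ignoring the constant overhead of simulating a single step):

$$
\frac{(L + f(n))^2}{2} \;\le\; \frac{\bigl(2\max(L, f(n))\bigr)^2}{2} \;=\; 2\max(L, f(n))^2 \;\le\; 2L^2 + 2f(n)^2 \;=\; O\!\left(f(n)^2\right),
$$

where the final step uses the fact that L, and therefore 2L^2, is a constant independent of n.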
Actually, there is one thing that we swept under the rug: checking whether the outputs of the finished programs are correct also takes some time. Fortunately, if we use the fastest known algorithms for multiplication instead of the standard algorithm, the time we spend checking is negligible. And this finishes the analysis of the basic version of Universal search. But how do we
improve O(f(n)^2) to O(f(n)) as I promised? Well, you can do that by tweaking the algorithm a
little bit. Instead of simulating exactly one step of every program, in each iteration
we allocate exponentially more steps to older programs. The complexity of this improved Universal search becomes roughly 2^L times f(n), which is even more ridiculous than before, but Big O can hide big constants, so it's O(f(n)). This is also how we implemented Universal search in our code.
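One common way to do this allocation (the exact bookkeeping in our code may differ, so take this as an illustrative sketch building on the earlier skeleton): in iteration k, the program started in iteration i gets 2^(k-i) steps, so the older a program is, the faster it races ahead.

```python
def improved_universal_search(n, programs, check):
    # Same simulator interface as before (.step() -> still running?, .output).
    running = []                              # list of (iteration started, simulator)
    k = 0
    for new_program in programs:
        k += 1
        running.append((k, new_program))
        survivors = []
        for i, sim in running:
            halted = False
            for _ in range(2 ** (k - i)):     # exponentially more steps for older programs
                if not sim.step():
                    halted = True
                    break
            if not halted:
                survivors.append((i, sim))
            elif check(n, sim.output):
                return sim.output
        running = survivors
```

With this scheme, program number L needs only about log2(f(n)) further iterations to accumulate its f(n) steps, and the total work up to that point sums to roughly 2^L times f(n).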
Speaking of implementation, there is a tiny difference between our code and our explanation. Since we wanted to make our code readable, we didn't want to go through simulating
all Python programs. So we asked ChatGPT, and it told us about this amazing high-powered language called Brainfuck, which is also so simple that we could write its interpreter.
The rest of the code has two classes. Universal search implements the whole functionality of our algorithm that doesn't depend on which problem we're actually trying to solve, and the class Universal factorization inherits from Universal search and implements the only problem-dependent part: checking whether the output of a simulated program is correct.
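In spirit, the split looks something like this; the class and method names below are illustrative rather than copied from our repository, and the simulation machinery is elided:

```python
class UniversalSearch:
    # Problem-independent part: enumerate programs (Brainfuck ones, in our code),
    # dovetail their simulation, and test every finished program's output.
    def run(self, input_value):
        for output in self.simulate_all_programs(input_value):
            if self.check_output(input_value, output):
                return output

    def simulate_all_programs(self, input_value):
        ...  # yields the output of each program that halts, via the dovetailing loop

    def check_output(self, input_value, output):
        raise NotImplementedError  # the only problem-dependent piece


class UniversalFactorization(UniversalSearch):
    def check_output(self, n, output):
        try:
            a, b = map(int, output.split())
        except (AttributeError, ValueError):
            return False
        return a > 1 and b > 1 and a * b == n
```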
I hope this makes it clear exactly how general the idea of Universal search is. For example, let's say we want to solve another classical problem: sorting. Then, we just need to write a new checking function for sorting and, voilà, we now have an asymptotically optimal algorithm for sorting.
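For instance, a sorting version could be as small as this (again illustrative, reusing the UniversalSearch base class sketched above): the output must be in non-decreasing order and use exactly the input's elements.

```python
from collections import Counter

class UniversalSorting(UniversalSearch):
    def check_output(self, input_list, output_list):
        in_order = all(a <= b for a, b in zip(output_list, output_list[1:]))
        return in_order and Counter(output_list) == Counter(input_list)
```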
In general, the only property of the problem that we're using is that there exists a linear-time algorithm that can check whether a proposed solution is correct. So this was Levin's Universal search, and I expect
many of you might now be a little bit confused about how to understand it. So let me finish with
a few thoughts about the lessons to take away from it. First of all, what I don't
want you to take away from this is that asymptotic complexity is a broken concept. Let's say that
we want to analyze some algorithm. There are a bunch of approaches we could take. First, you
could code the algorithm and empirically measure how fast it is as a function of input size. Or you
can just compute the asymptotic complexity. Or, if you're a hardcore theoretician, you can
just check whether it runs in polynomial time on a Turing machine. You can see how these
options get increasingly abstract and easier to work with. At one extreme, analyzing
an algorithm empirically is usually quite tricky, because there are a lot of details that you need
to get right. Computing asymptotic complexity is usually reasonably easy and determining whether an
algorithm runs in polynomial time is often obvious just by looking at it. But as these concepts get
more abstract, they also get less informative. Just knowing whether an algorithm runs in
polynomial time doesn't really tell us that much about it. Asymptotic complexity tells you quite
a lot about an algorithm but in many cases, like when you need to choose one out of several sorting
algorithms with the same complexity, you'll really need to sit down and do some empirical testing.
And so you can see that asymptotic complexity often lies in the sweet spot - it's usually quite
easy to compute, while still being informative. This makes it the right level of abstraction
for many problems. And sure, it hides the fact that some algorithms, like Universal search, have really large constant factors, but most of the time it's just fine. Asymptotic complexity
is just a model that computer scientists use, similar to physicists using frictionless surfaces
that don't actually exist. All models are wrong, but some of them are useful. But let's go back to
Universal search. Apart from being really weird, is it actually useful for anything? Well, already
the fact that it is weird means that knowing it can help you quickly disprove conjectures and
build intuition. It is similar to all those weird functions you might know from calculus class like
this one, or that one, or this one. All of these are really weird and it's good to have them in the
back of your mind. And the same goes for Universal search. But there is also one direct application
of Universal search, albeit a very theoretical one. If you think about a concrete problem like
factoring, a priori it's not even clear whether there exists an asymptotically optimal algorithm.
Why can't we just have a sequence of faster and faster algorithms without there being an
asymptotically fastest one? Although it's possible to construct some absolutely unhinged problems
where this does happen, in practice that's not the case. And the reason is that the optimal algorithm
is Universal search. So that's why in practice you don't get a sequence of faster and faster
algorithms. In any case, I hope you enjoyed our videos about Universal search - an asymptotically
optimal algorithm for many problems. The content of this video was extremely subtle, so let us
know if you're confused about anything and if you enjoyed these less or more than our usual
algorithm explainers. And before I go, there are two things that I wanted to mention. The first
one is we've set up a Patreon for the channel, so if you'd like to support us, please check out
the link in the description. As a patron you'll get beta access to the videos. We'll release
an early version to our patrons and then you can send us feedback and your suggestions will
make it to the final product. And another thing is I wanted to quickly plug a project that we've
been working on with a few other friends, namely impromptu. impromptu is an AI-based party game where you use AI to generate weird images, like cats sword fighting, and then the other players see the image and try to come up with fake prompts that might have generated the image. And
then you try to guess which prompt was the real one. So if that sounds like fun to you, check out
the game, it's linked in the video description, grab a few friends, and give it a go. And with that, I'll see you in the next one. Bye bye!