- [Jade] This video was
made possible by Nebula. What is the fastest way
to solve any Rubik's cube? How do you complete a level
of Super Mario Brothers in the shortest possible time? How can we predict how
a protein will fold? These questions hold the key
to unlocking unimaginable advancements in technology,
security, and optimization. Yet to this day, they remain unsolved. They're part of a group of
over a thousand problems that share an amazing quality solving Solving just one of them
would solve all of them. Because of this, there's
a $1 million prize waiting for whoever solves one. But why would finding
the fastest way to solve any Rubik's cube tell
you the optimal place to put a hospital? Why would finding a way
to solve any Sudoku puzzle tell you the cheapest way
to fly around the world? Hi everyone. Welcome to
Up and Atom, I'm Jade. In this video, we are going
to explore how some of the world's most challenging
problems are all connected in a deep and mysterious way, and why even though a solution
to any one of them would radically transform our
world, they remain unsolved. To understand the theory
of NP-completeness, let's start at the beginning. What does this claim actually mean? What kind of problems are
these and what would it mean to solve one? Let's take this one. You
want to organize a party. Unfortunately, not all of
your friends get along. You'd like things to go smoothly, so you want to make sure that
you don't invite two people who clash and you want to
invite at least three people. Can you do it? We can represent your friend
network as a graph where each friend is a node. If they get along, they're
connected by an edge. If they don't get along,
they're not connected. Clearly, the problem is that
Dennis and Bob don't get along. So you can either invite
Alice, Bob, and Charlie or Alice, Dennis, and Charlie. This example was small
enough to easily figure out, but what if you were actually popular and your friend network looked like this? Here you have 20 friends. Can you throw a party of 12
so that everyone gets along? Now, it's not so obvious. This is a famous problem in
computer science called clique. A clique refers to a set of
nodes that are all connected by an edge. So in this case, the
question is can you find a clique of 12? However, when computer scientists refer to the clique problem, they
don't just mean one specific instance like 20 nodes
with a clique of 12. They mean for every
instance of the problem, for any graph with any
number of nodes and for any clique size, how can you
answer what is essentially infinitely many questions? An algorithm is a set of
step-by-step instructions that you can mindlessly follow,
kind of like a cake recipe. Computers are very good
at carrying out algorithms because they're good at
processing simple instructions very quickly. Let's see if we can come up
with an algorithm to solve this specific clique problem. This network has 10 nodes. Can we write an algorithm to tell us if there is a clique of five? How about this? Select a
new group of five nodes according to some rule we've specified. The details of the rule aren't important, but we do need to tell
the algorithm how to pick the set of five nodes. This
will be important later. Next, check if all the
selected nodes are connected by an edge, if they are all connected, we've found a clique of five
and the algorithm halts. If they are not all connected, the algorithm checks whether
it has tried all possible groups of five nodes. If it hasn't, it loops back to this step and selects a new group of five nodes. It repeats this process until
it finds a clique of five then halts. However, if it has tried all
possible groups of five nodes, it concludes that no clique
of five exists and halts. This isn't a particularly
clever algorithm, but the nice part is that it
works for any network of any number of nodes and any clique size. In other words, it's a
general algorithm for finding any clique in any network. So did we just solve the clique problem? No. This kind of algorithm
is known as brute force or exhaustive search, and
it has a few problems. For 20 nodes and a clique of
12, it would take less than one second to run on a computer. For 40 nodes and a clique of
24, it would take a few seconds to run. For 80 nodes and a clique
of 48 it would take a couple hundred years to compute. And for 160 nodes and
a clique of 96, well, by the time the algorithm stopped, all stars in the universe
would be long dead. As the input N grows larger, the runtime of the algorithm
grows exponentially. This is what makes clique a hard problem. It's what computer
scientists call intractable, meaning that for all practical purposes, we will never find a solution for large N. This is why the clique
problem remains unsolved. Computer scientists care
about how a problem scales as the input grows, and so far
no one has found an algorithm that scales faster than exponentially. So what would a solution look like? A cleverer, faster
algorithm. How much faster? One that scales polynomially
where an exponential has the input number in the exponent, a polynomial has the
input number in the base and a constant in the exponent. So as the input end grows,
the time needed to complete the algorithm grows much more slowly. The computer scientist
Jack Edmonds once said that, "For practical purposes,
the difference between polynomial and exponential
order is often more crucial than the difference between
finite and non-finite." So yeah, that's a pretty big difference. - So this whole distinction
between kind of exponential and polynomial seems rather
artificial to a lot of people because it is like it's, again, it just seems like two
very, very strange classes of functions. There are functions that are
in between the two as well, and if you have like an algorithm
that can solve a problem polynomially, no matter how
inefficient it might be, you know that it is doing
something that looks very, very different
than brute force search. And this is, it is kind
sort of one kind of important distinction. So you can kind of think of
exponential problems as sort of problems where you sort of just
try all the possibilities in some sense and polynomial
algorithms as those that are clearly kind of doing something else. - [Jade] All of these
problems share the property that no algorithm faster than
an exponential one has been found to solve them. So the statement solving one
of these problems would solve them all more concretely means
finding a polynomial time algorithm for one of these
problems would give way to finding a polynomial time algorithm for all of these problems. Now that we know what
NP-completeness means, we can dive into how it works. The next step is to understand
what exactly NP means. Although these problems
all look very different, they do have something in common. Yes, they are very hard to solve, but if we are magically
presented with a solution, we can very easily
check that it's correct. Kind of like finding a
needle in a haystack. It may take ages to find,
but once it's found, well, there's no need to
explain anything. There it is. So although they take an
exponential time to solve, they can be verified in polynomial time. In computational complexity
theory, problems are sorted into classes based on what types
of algorithms can solve them and what kinds of resources
these algorithms use. Like how much time or space. The Class P contains
problems, which can be solved and checked by a
polynomial time algorithm. The class EXP contains
problems which can be solved by an exponential time
algorithm and also take an exponential time algorithm
to check their solution. Our problems fit into the Class NP, which represent those which
can be solved by an exponential time algorithm, but checked by
a polynomial time algorithm. P for polynomial, EXP for exponential. But what does NP stand for? So the answer to this question
actually took me about three months to understand
with two computer scientists explaining it to me. So don't worry too much if
you don't get it right away. However, it is crucial to
understanding the entire premise of this video. NP stands for
non-deterministic polynomial, meaning that our problems can be solved by a non-deterministic
polynomial time algorithm. We already know what polynomial means. So how about non-deterministic? In computer science,
non-determinism models computation where the next step of the
computation is not determined by the current state of the algorithm. It'll make more sense
if we first talk about what deterministic computation means. Deterministic computation is the regular everyday computation. We're familiar with the kind
we built our clique algorithm with, one thing happens at
a time and there is only one possible next step
according to where we're at in the computation. The algorithm selects five nodes according to some specified rule. It then checks if they're all connected. If they're not, it checks
whether it has checked all possible groups of five nodes. If not, it loops back to this
step and repeats, easy peasy. The important thing is it
does one thing at a time and the next step is always determined. With non-deterministic computation, the next step is not always determined. Instead of going through
groups of five nodes one by one in a predetermined way, the algorithm non-deterministically
picks any five nodes. There is no rule, any five
can be picked at the next step and we can't predict what they will be. There are two ways to think about this. The first is that the non-determinism
causes the computation to branch out into all possible
choices at no extra cost with respect to time steps. It then checks all groups in
parallel to see if the nodes are connected. It does so in the same
amount of time steps that a deterministic
algorithm would take to check one group of five nodes. As long as one of the
branches return a yes, then the whole computation
will return a yes. The other way to think of
non-deterministic computation is that rather than trying
all possibilities out, the non-determinism magically
allows the algorithm to guess the correct group on the first try. Obviously, non-deterministic
computation is a lot faster than deterministic computation. We can skip all the time
checking all the paths one by one and just check all the paths in parallel. The talented computer
scientist Eric Domain said that "Non-determinism is
like engineering luck." Checking all paths in parallel
to find the right one is computationally equivalent
to guessing the right path on the first try. So as I mentioned, I found
non-deterministic computation hard to understand. It's not a realistic or
practical model of computation. So my query was why even bother
coming up with such a model? To help clarify, I've
brought in reinforcements. What's up with this
non-deterministic model? - So it's not meant to model realistic or practical computation,
but rather you can see it as a theoretical tool to
understand problems better. So the reason why it's
important is even though it's not a realistic model,
it gives us a more expressive model of computation and it
allows us to characterize all those problems that have in common, that the guessing is the
hard part and the reason why that's such a rich and deep
theory and actually useful even for practical
reasons, is that before, people would've been working
separately on all these, on all these problems
without seeing the sort of commonality that they had
between them and having this way to sort of characterize them
all basically shows that you're really studying the
same thing when you're studying all of them and the thing
that makes them different is from a computational point
of view is just detail. So that computationally,
if you can solve one, that means that you would've
solved this sort of abstract concept of guessing, which is why you would've
solved all of them. - So hopefully now you
have an idea of what the NP in NP-completeness means. We are referring to a class
of problems that can be solved in polynomial time if they have access to a non-deterministic algorithm. What's cool about this idea
is that it mathematically captures being hard to
solve, but easy to check, but that still doesn't explain why solving one of these problems
would solve them all. Why solving Candy Crush
would solve protein folding? For that, we need to
talk about completeness. To fully understand it, we need to go back to when
it all began, the '70s. This whole solving one problem
would solve them all thing didn't exist yet and computer
scientists were working on these problems in isolation. It was two cool dudes, Stephen Cook and Leonard
Levin that revealed their deep connection. The work was so groundbreaking
that Cook won a Turing award for it, the Nobel Prize
of computer science. The way they did it is
reminiscent of something Rene Descartes did 300 years earlier. You might know Descartes
for I think therefore I am, but he also did some
revolutionary mathematics. See, before Descartes, algebra
and geometry were completely separate fields of study. Then Descartes invented
the Cartesian plane and all of a sudden, algebraic
expressions could represent shapes and lines. Shapes and lines could be transformed into algebraic expressions. These were no longer two different things, but different representations
of the same thing. Cook and Levin did something
similar for computer science. They showed the connection
between a very famous NP problem called SAT and computation. In fact, they showed that
the SAT problem could carry out computation. This is a very strange idea, a mathematical problem doing computation. So now I'm gonna tell you
what the SAT problem is. We are going to clarify what
computation is and then, I'm going to show you
how a SAT problem can do a computation. To understand the SAT problem, you need to help me choose
what activities to do at the party. I have everything we need
for either xylography, yarn crafting, and or playing the zill. If we do yarn crafting, our
yarn art would be complimented really well by some wood engravings, so it would be crazy to
do yarn crafting without doing xylography. Xylography and playing
the zill will both take up a lot of time, so we
can't do both of those. On the other hand, my friend
Bob loves both yarn crafting and playing the zill, so we must do at least one of them. And I love arts and crafts, so I want to do either
xylography or yarn crafting, if not both. All of these constraints can
be summed up by this formula. This is called a boolean
formula because it consists of boolean algebra. Boolean algebra is made up of
and, or, and not operations. Unlike regular algebra where
variables can take on numerical values, boolean variables can
only take on the values of true or false, which we can
represent as one or zero. The and operation returns are
true only if all inputs are true, otherwise it returns all false. The or operation returns are
true if at least one input is true and then not operation,
which will represent by a bar above the variable simply flips the value. If X is true, not X is false, and if X is false, not X is true. This formula will tell us
what activities we can do at the party given the constraints. Each bracket corresponds
to a different constraint. Our aim is to find the right
combination of X, Y, and Z values so that the entire
formula returns a one. If we can do this, we say that the problem
has been satisfied. Hence, this type of
formula is called a boolean satisfiability problem or SAT for short. Because all the brackets
are connected by an and, that means we need to
choose the values such that the contents of each bracket equals one. Remember, the and operation
only returns a one if all inputs are one. This bracket, Y or Z,
represents the constraint that my friend Bob
loves both yarn crafting and playing the zill. So we must do at least one of them. The or operation returns a one
if at least one of the values is one. So here we can either have Y
equals one and Z equals zero. We can also have Y equals
zero and Z equals one. Or we can have both Y and Z equals one, but we cannot have Y and
Z equals zero as this would return a zero. Therefore, these three combinations
satisfy this constraint. In a similar way, this bracket X or Y represents
the constraint that I want to do either xylography or
yarn crafting, if not both. This bracket represents the
constraint that if we do yarn crafting, we must also
do xylography and this bracket represents the constraint that
we can't do both xylography and playing the zill. If we solve this formula
such that the whole thing equals one, this will tell
us what activities we can do at the party. SAT is an NP problem. We don't have a guaranteed
significantly better way to solve it than just to check
every single combination of true and false, a brute force
or exponential approach. Pause the video now if
you'd like to figure out the right combination of
true and false values for X, Y, and Z. The answer will be revealed
in about four seconds. We also know that it's an NP
problem because once presented with a solution, it's very easy
to check that it's correct. We just plug in the values
and see if the formula returns true or false. Looks like we're doing
xylography and yarn crafting, just what I wanted, but this
really doesn't look like a computer at all. How on earth could this do
everything a computer can do? - Boolean formulas are this
incredibly sort of expressive language. You can basically describe
all kinds of constraints, all kinds of, so just like
you know our language has the seeming capacity to
just describe everything. We can talk about, you know, trees and planets and
societies and everything. This whole sort of like
logical language sort of has the same kind of power when it comes to, at least when it comes
to sort of mathematics. You can sort of describe
any sort of objects, you can talk about
(indistinct), graphs, whatever. And so this whole question
of is a particular boolean formula satisfiable
basically often kind of boils down to the question of, you know, does a certain kind of object exist? Does the number with this property exist? Does a graph with this property exist? And so the satisfiability
question is sort of very, very powerful
because it is one where this universality is
perhaps most, most salient. - Cool, so that's the SAT problem. So now what Cook and Levin
did was show how a SAT formula could mirror the behavior
of a particular type of computational model. We all have some intuitive
idea of what computation is. For one, it's what these things do, but a vague idea isn't good enough. If we really want to examine
these bad boys and see what they're capable of, we need a
rigorous mathematical model. A mathematician named Alan
Turing came up with such a model called the Turing Machine. I've made a video all
about the Turing Machine and why it was needed, so
make sure to check that out if you'd like to know the full story. Here, we'll just stick to the basics. A Turing Machine consists
of several components that together capture our
intuition of computation. A tape, which is the
computer's way of keeping track of what it's working on. It contains symbols like ones and zeros. Then there's a head which
points to a cell on the tape and tells the computer what
part of the tape it is currently focusing on. The head can move left or right, it can write and erase
symbols at any moment in time. The machine is in a particular
state and it's looking at one particular square on the tape. For the machine to know
what to do at each state, it has an instruction manual. This tells the machine what
to do in its given state depending on what is
written on the square. For example, if it's in
state three and reads a one, it replaces the one with a zero,
moves the head to the right and enters state five. It then checks the
instruction manual to know what to do next. For example, in state
five, if it reads a zero, it leaves it as a zero,
moves the head right, and enters state 100. A Turing Machine mathematically
captures our intuition of what computation is. Although simple, they
are extremely powerful. A Turing Machine can do
everything a modern computer can do. We can extend our idea of
non-deterministic computation to Turing Machines and
imagine a non-deterministic Turing Machine. A non-deterministic Turing
Machine is like a Turing Machine, except that instead of only
being able to follow one instruction, the machine can
branch out and follow several. As long as one of the options is correct, the machine will find it. It follows that a non-deterministic
Turing Machine could solve any NP problem as they could run any non-deterministic algorithm. When we said that Cook and
Levin showed that a SAT formula could represent computation, they showed that a SAT
formula could represent a non-deterministic Turing Machine. That you can construct a
SAT formula to carry out any computation that any
non-deterministic Turing Machine could do. Now comes the crux of the
entire video showing how a SAT formula can
represent a Turing Machine, the so-called Cook-Levin Theorem. So Cook and Levin showed how a
SAT formula can represent any non-deterministic Turing Machine, but that is a very hefty process, which involves a lot of
complex math and is an entire lecture of a computer science degree. I've linked some videos
if you'd like to look at the complete process. However, here we are going
to convey the same idea with a very simple toy example. This Turing Machine contains
only one cell in the tape. In it can either be
written a one or a zero. If it's a one, the Turing
Machine returns a yes and halts. If it's a zero, the Turing
Machine returns a no and halts. The question we are asking is, does this Turing Machine return a yes? In this case, the tape
has a one written in it, so the machine will indeed return a yes. This machine only has three
states start, yes and no, and there are only two time
steps in the whole computation. The head also doesn't even need to move, the machine is that simple. The behavior of this Turing
Machine is captured by this formula. Whoa. Okay, let's break it down. Let me explain the notation. I've chosen the letter S
as a variable to represent the state of the Turing
Machine during each time step. This says that at timestep one, the machine is in the start state. This says that at timestep two, the machine is in the yes state. I've chosen the letter X to
represent what symbol is written on the square during each timestep. This says that at timestep
one, a one is written in the square. This says that at timestep,
one a zero is written in the square. So this bracket says that
if we are in the start state at timestep one, and the
machine reads a one on the tape, the machine moves into the
yes state at timestep two. This bracket says that if
we are in the start state at timestep one, and the machine
reads a zero on the tape, the machine moves into the
no state at timestep two. Like with our party example, we want to make this whole
formula true because we have an or here this time. This whole thing is true if
one of the brackets or if both of the brackets are true. Because all of these variables
are connected by ands, they all need to be true
for a bracket to be true. What this is telling us is that
for the formula to be true, one or both of the instructions
need to be followed, although it actually can't
be both in this case, but we'll get to that later. Hence, the variables
we need to be true for the whole formula to be true
are S start T equals one, X one T equals one,
and S yes T equals two, or S start T equals one,
X zero T equals one, and S no T equals two or all of them. But we're not quite done. There are some things the
machine does that this formula doesn't capture, like the fact that only one
symbol can be in the square at a time, or that the machine can only be in one internal state at a
time as it can't be in both the yes and no state. It also doesn't specify what
symbol is actually written on the tape before we run the machine, but not to worry because
Cook and Levin came up with formulas for all of these constraints too. To make sure only one symbol
is in the square at a time, we add this formula. The left bracket captures
the case where there is a one written in the tape and not a zero. For it to be true, X zero must be false
and X one must be true. The right bracket captures the
opposite case where a zero is written in the tape. For the bracket to be true, X one must be false and
X zero must be true. The or between the brackets
means that at least one of the brackets must be
true for the entire formula to be true. When both X zero T equals one and X one T equals one are true, both brackets are false
and hence the whole formula is equal to false. In other words, we cannot
have both a zero and a one in the square at timestep one. However, if X zero T equals one is true and X one T equals one
is false or vice versa, then one of the brackets is true and the whole formula is true. We add the equivalent
formula for timestep two. To represent the constraint
that the machine can only be in one internal state at a time, we add this formula. It looks long, but it's very simple. When more than one of the
S variables is set to true, the whole formula is false. However, when only one of
the S variables is true, the whole formula is true. We add the equivalent
formula for timestep two, hence the machine can only
be in one internal state at a time. And finally, our Turing Machine starts with
a one written on the tape. We represent that very simply
by adding X one T equals one. To ensure that all of
these conditions are met, we connect all the formulas by an and. This way for the whole
formula to turn out true, each sub formula must be true. This SAT formula represents
every single part of this Turing Machine. So we've just showed how a
SAT formula can represent this incredibly simple Turing
Machine by asking, does this formula have a
satisfying assignment of X, Y, and Z variables? We're effectively asking, will
this Turing Machine return a yes, Cook and Levin showed
how to build a SAT formula to represent any Turing Machine, even a non-deterministic
Turing Machine, that the SAT problem could answer any
question a non-deterministic Turing Machine could answer. So if we did find some
kind of way to solve the general SAT problem,
that would mean that we could simulate a non-deterministic
Turing Machine and remember how a non-deterministic
Turing Machine could solve all of the NP problems. So that tells us that
solving SAT would also solve all other NP problems. Thus SAT was the first
ever NP complete problem, a problem that could simulate a non-deterministic Turing Machine. This is incredible, but
it's not what I promised. SAT is just one problem and I
promised that solving any of these problems would solve them all. Here comes a new player in the game. Cook won a Turing award
for kicking off the whole NP complete movement, but
he shared his Turing award with another computer
scientist, Richard Karp. In a revolutionary paper, Karp showed that another 21
problems were NP complete in the same way that SAT was. One of these problems
was the clique problem from earlier in the video. That's right, the clique
problem can also simulate a non-deterministic Turing Machine. Since Karp's revolutionary paper
researchers have identified thousands of NP complete problems
in a wide range of fields such as cryptography, optimization, and artificial intelligence. NP completeness unifies
thousands of problems that were previously distinct, instead
of having many research teams trying to tackle thousands
of separate problems, the theory of NP completeness
exposes the fact that the fundamental difficulty
in solving these separate problems is the same
one, the guessing part and that solving it for
one problem would solve it for all others. This is why computer
scientists don't really think we'll ever solve these NP
problems because that would mean that non-determinism was not
more powerful than determinism, and that checking that something
is right is just as easy as finding the solution to it. That's a pretty big shift in worldview. However, no one has yet
proved that there isn't a faster algorithm for these problems. Making the question of
whether P equals NP, the biggest open question
in computer science. One of the most important
things about what Cook and Levin did is that now to show that
a problem is NP complete, you just need to take a
known NP complete problem and reduce it to your new problem. This is much easier than reducing it from a non-deterministic Turing Machine. In fact, when faced
with a new hard problem, it's often the first instinct
of computer scientists to try to reduce a known NP complete
problem to their new problem. This is so they know whether
or not to spend hours trying to solve it or whether
it joins the ranks of the infamous NP complete class. A reduction is one of the
most important techniques in theoretical computer science
and in a bonus video on Nebula, I show you how to reduce the SAT problem to the clique problem, proving
that the clique problem can also simulate a
non-deterministic Turing Machine. Watching this will give you
insight into how such different looking problems are actually
the same one in disguise, and you'll understand
the powerful techniques that computer scientists use. The reason I chose not
to include the reduction in this video is that
it's quite technical. The YouTube algorithm
tends to reward exciting and snappy content, but sometimes
you do need to go deeper into the math to really
understand and appreciate a topic, but I know a lot of you already know that and love a more in-depth explanation. So thankfully, I was still able to make
the bonus video on Nebula. Nebula is a streaming
platform made by myself and a bunch of educational
creator friends. What's amazing about it is
because it's a subscription service, we can make the content
we really want to without worrying about whether or
not the YouTube algorithm will reward us. So I can include the more
in-depth explanations that I know a lot of you love. Signing up to Nebula is the
best way to support this channel and the creation of
educational content in general. I'm deeply thankful to my
subscribers who are currently supporting me by being
subscribed to Nebula. There's a ton of exclusive high
production content there by some of your favorite creators, including Real Engineering,
Joe Scott, Real Science, and many more for you to discover. If you sign up with my link right now, you get free access to Nebula
classes where creators make classes about how to do what they do best. One class I highly recommend
is What Is Code taught by the coding (indistinct) Daniel Shiffman. For anyone who liked this video
and would like to learn more about the practical side
of computer science, it's a beginner friendly
introduction to the ins and outs of coding. If you sign up using the link on screen, which you can also find
in the description, you can support me directly
and get both Nebula and Nebula classes with a 40%
discount off the annual plan. That's just a little over
$2.50 a month for all of this exclusive, high quality content. As I mentioned, signing up
to Nebula is the best way to support the channel and the creation of educational content. Thank you for watching, and I'll see you in the next episode. Bye.
(upbeat music plays)