NP-COMPLETENESS - The Secret Link Between Thousands of Unsolved Math Problems

Video Statistics and Information

Video

Captions Word Cloud

Reddit Comments

Captions

- [Jade] This video was made possible by Nebula. What is the fastest way to solve any Rubik's cube? How do you complete a level of Super Mario Brothers in the shortest possible time? How can we predict how a protein will fold? These questions hold the key to unlocking unimaginable advancements in technology, security, and optimization. Yet to this day, they remain unsolved. They're part of a group of over a thousand problems that share an amazing quality solving Solving just one of them would solve all of them. Because of this, there's a $1 million prize waiting for whoever solves one. But why would finding the fastest way to solve any Rubik's cube tell you the optimal place to put a hospital? Why would finding a way to solve any Sudoku puzzle tell you the cheapest way to fly around the world? Hi everyone. Welcome to Up and Atom, I'm Jade. In this video, we are going to explore how some of the world's most challenging problems are all connected in a deep and mysterious way, and why even though a solution to any one of them would radically transform our world, they remain unsolved. To understand the theory of NP-completeness, let's start at the beginning. What does this claim actually mean? What kind of problems are these and what would it mean to solve one? Let's take this one. You want to organize a party. Unfortunately, not all of your friends get along. You'd like things to go smoothly, so you want to make sure that you don't invite two people who clash and you want to invite at least three people. Can you do it? We can represent your friend network as a graph where each friend is a node. If they get along, they're connected by an edge. If they don't get along, they're not connected. Clearly, the problem is that Dennis and Bob don't get along. So you can either invite Alice, Bob, and Charlie or Alice, Dennis, and Charlie. This example was small enough to easily figure out, but what if you were actually popular and your friend network looked like this? Here you have 20 friends. Can you throw a party of 12 so that everyone gets along? Now, it's not so obvious. This is a famous problem in computer science called clique. A clique refers to a set of nodes that are all connected by an edge. So in this case, the question is can you find a clique of 12? However, when computer scientists refer to the clique problem, they don't just mean one specific instance like 20 nodes with a clique of 12. They mean for every instance of the problem, for any graph with any number of nodes and for any clique size, how can you answer what is essentially infinitely many questions? An algorithm is a set of step-by-step instructions that you can mindlessly follow, kind of like a cake recipe. Computers are very good at carrying out algorithms because they're good at processing simple instructions very quickly. Let's see if we can come up with an algorithm to solve this specific clique problem. This network has 10 nodes. Can we write an algorithm to tell us if there is a clique of five? How about this? Select a new group of five nodes according to some rule we've specified. The details of the rule aren't important, but we do need to tell the algorithm how to pick the set of five nodes. This will be important later. Next, check if all the selected nodes are connected by an edge, if they are all connected, we've found a clique of five and the algorithm halts. If they are not all connected, the algorithm checks whether it has tried all possible groups of five nodes. If it hasn't, it loops back to this step and selects a new group of five nodes. It repeats this process until it finds a clique of five then halts. However, if it has tried all possible groups of five nodes, it concludes that no clique of five exists and halts. This isn't a particularly clever algorithm, but the nice part is that it works for any network of any number of nodes and any clique size. In other words, it's a general algorithm for finding any clique in any network. So did we just solve the clique problem? No. This kind of algorithm is known as brute force or exhaustive search, and it has a few problems. For 20 nodes and a clique of 12, it would take less than one second to run on a computer. For 40 nodes and a clique of 24, it would take a few seconds to run. For 80 nodes and a clique of 48 it would take a couple hundred years to compute. And for 160 nodes and a clique of 96, well, by the time the algorithm stopped, all stars in the universe would be long dead. As the input N grows larger, the runtime of the algorithm grows exponentially. This is what makes clique a hard problem. It's what computer scientists call intractable, meaning that for all practical purposes, we will never find a solution for large N. This is why the clique problem remains unsolved. Computer scientists care about how a problem scales as the input grows, and so far no one has found an algorithm that scales faster than exponentially. So what would a solution look like? A cleverer, faster algorithm. How much faster? One that scales polynomially where an exponential has the input number in the exponent, a polynomial has the input number in the base and a constant in the exponent. So as the input end grows, the time needed to complete the algorithm grows much more slowly. The computer scientist Jack Edmonds once said that, "For practical purposes, the difference between polynomial and exponential order is often more crucial than the difference between finite and non-finite." So yeah, that's a pretty big difference. - So this whole distinction between kind of exponential and polynomial seems rather artificial to a lot of people because it is like it's, again, it just seems like two very, very strange classes of functions. There are functions that are in between the two as well, and if you have like an algorithm that can solve a problem polynomially, no matter how inefficient it might be, you know that it is doing something that looks very, very different than brute force search. And this is, it is kind sort of one kind of important distinction. So you can kind of think of exponential problems as sort of problems where you sort of just try all the possibilities in some sense and polynomial algorithms as those that are clearly kind of doing something else. - [Jade] All of these problems share the property that no algorithm faster than an exponential one has been found to solve them. So the statement solving one of these problems would solve them all more concretely means finding a polynomial time algorithm for one of these problems would give way to finding a polynomial time algorithm for all of these problems. Now that we know what NP-completeness means, we can dive into how it works. The next step is to understand what exactly NP means. Although these problems all look very different, they do have something in common. Yes, they are very hard to solve, but if we are magically presented with a solution, we can very easily check that it's correct. Kind of like finding a needle in a haystack. It may take ages to find, but once it's found, well, there's no need to explain anything. There it is. So although they take an exponential time to solve, they can be verified in polynomial time. In computational complexity theory, problems are sorted into classes based on what types of algorithms can solve them and what kinds of resources these algorithms use. Like how much time or space. The Class P contains problems, which can be solved and checked by a polynomial time algorithm. The class EXP contains problems which can be solved by an exponential time algorithm and also take an exponential time algorithm to check their solution. Our problems fit into the Class NP, which represent those which can be solved by an exponential time algorithm, but checked by a polynomial time algorithm. P for polynomial, EXP for exponential. But what does NP stand for? So the answer to this question actually took me about three months to understand with two computer scientists explaining it to me. So don't worry too much if you don't get it right away. However, it is crucial to understanding the entire premise of this video. NP stands for non-deterministic polynomial, meaning that our problems can be solved by a non-deterministic polynomial time algorithm. We already know what polynomial means. So how about non-deterministic? In computer science, non-determinism models computation where the next step of the computation is not determined by the current state of the algorithm. It'll make more sense if we first talk about what deterministic computation means. Deterministic computation is the regular everyday computation. We're familiar with the kind we built our clique algorithm with, one thing happens at a time and there is only one possible next step according to where we're at in the computation. The algorithm selects five nodes according to some specified rule. It then checks if they're all connected. If they're not, it checks whether it has checked all possible groups of five nodes. If not, it loops back to this step and repeats, easy peasy. The important thing is it does one thing at a time and the next step is always determined. With non-deterministic computation, the next step is not always determined. Instead of going through groups of five nodes one by one in a predetermined way, the algorithm non-deterministically picks any five nodes. There is no rule, any five can be picked at the next step and we can't predict what they will be. There are two ways to think about this. The first is that the non-determinism causes the computation to branch out into all possible choices at no extra cost with respect to time steps. It then checks all groups in parallel to see if the nodes are connected. It does so in the same amount of time steps that a deterministic algorithm would take to check one group of five nodes. As long as one of the branches return a yes, then the whole computation will return a yes. The other way to think of non-deterministic computation is that rather than trying all possibilities out, the non-determinism magically allows the algorithm to guess the correct group on the first try. Obviously, non-deterministic computation is a lot faster than deterministic computation. We can skip all the time checking all the paths one by one and just check all the paths in parallel. The talented computer scientist Eric Domain said that "Non-determinism is like engineering luck." Checking all paths in parallel to find the right one is computationally equivalent to guessing the right path on the first try. So as I mentioned, I found non-deterministic computation hard to understand. It's not a realistic or practical model of computation. So my query was why even bother coming up with such a model? To help clarify, I've brought in reinforcements. What's up with this non-deterministic model? - So it's not meant to model realistic or practical computation, but rather you can see it as a theoretical tool to understand problems better. So the reason why it's important is even though it's not a realistic model, it gives us a more expressive model of computation and it allows us to characterize all those problems that have in common, that the guessing is the hard part and the reason why that's such a rich and deep theory and actually useful even for practical reasons, is that before, people would've been working separately on all these, on all these problems without seeing the sort of commonality that they had between them and having this way to sort of characterize them all basically shows that you're really studying the same thing when you're studying all of them and the thing that makes them different is from a computational point of view is just detail. So that computationally, if you can solve one, that means that you would've solved this sort of abstract concept of guessing, which is why you would've solved all of them. - So hopefully now you have an idea of what the NP in NP-completeness means. We are referring to a class of problems that can be solved in polynomial time if they have access to a non-deterministic algorithm. What's cool about this idea is that it mathematically captures being hard to solve, but easy to check, but that still doesn't explain why solving one of these problems would solve them all. Why solving Candy Crush would solve protein folding? For that, we need to talk about completeness. To fully understand it, we need to go back to when it all began, the '70s. This whole solving one problem would solve them all thing didn't exist yet and computer scientists were working on these problems in isolation. It was two cool dudes, Stephen Cook and Leonard Levin that revealed their deep connection. The work was so groundbreaking that Cook won a Turing award for it, the Nobel Prize of computer science. The way they did it is reminiscent of something Rene Descartes did 300 years earlier. You might know Descartes for I think therefore I am, but he also did some revolutionary mathematics. See, before Descartes, algebra and geometry were completely separate fields of study. Then Descartes invented the Cartesian plane and all of a sudden, algebraic expressions could represent shapes and lines. Shapes and lines could be transformed into algebraic expressions. These were no longer two different things, but different representations of the same thing. Cook and Levin did something similar for computer science. They showed the connection between a very famous NP problem called SAT and computation. In fact, they showed that the SAT problem could carry out computation. This is a very strange idea, a mathematical problem doing computation. So now I'm gonna tell you what the SAT problem is. We are going to clarify what computation is and then, I'm going to show you how a SAT problem can do a computation. To understand the SAT problem, you need to help me choose what activities to do at the party. I have everything we need for either xylography, yarn crafting, and or playing the zill. If we do yarn crafting, our yarn art would be complimented really well by some wood engravings, so it would be crazy to do yarn crafting without doing xylography. Xylography and playing the zill will both take up a lot of time, so we can't do both of those. On the other hand, my friend Bob loves both yarn crafting and playing the zill, so we must do at least one of them. And I love arts and crafts, so I want to do either xylography or yarn crafting, if not both. All of these constraints can be summed up by this formula. This is called a boolean formula because it consists of boolean algebra. Boolean algebra is made up of and, or, and not operations. Unlike regular algebra where variables can take on numerical values, boolean variables can only take on the values of true or false, which we can represent as one or zero. The and operation returns are true only if all inputs are true, otherwise it returns all false. The or operation returns are true if at least one input is true and then not operation, which will represent by a bar above the variable simply flips the value. If X is true, not X is false, and if X is false, not X is true. This formula will tell us what activities we can do at the party given the constraints. Each bracket corresponds to a different constraint. Our aim is to find the right combination of X, Y, and Z values so that the entire formula returns a one. If we can do this, we say that the problem has been satisfied. Hence, this type of formula is called a boolean satisfiability problem or SAT for short. Because all the brackets are connected by an and, that means we need to choose the values such that the contents of each bracket equals one. Remember, the and operation only returns a one if all inputs are one. This bracket, Y or Z, represents the constraint that my friend Bob loves both yarn crafting and playing the zill. So we must do at least one of them. The or operation returns a one if at least one of the values is one. So here we can either have Y equals one and Z equals zero. We can also have Y equals zero and Z equals one. Or we can have both Y and Z equals one, but we cannot have Y and Z equals zero as this would return a zero. Therefore, these three combinations satisfy this constraint. In a similar way, this bracket X or Y represents the constraint that I want to do either xylography or yarn crafting, if not both. This bracket represents the constraint that if we do yarn crafting, we must also do xylography and this bracket represents the constraint that we can't do both xylography and playing the zill. If we solve this formula such that the whole thing equals one, this will tell us what activities we can do at the party. SAT is an NP problem. We don't have a guaranteed significantly better way to solve it than just to check every single combination of true and false, a brute force or exponential approach. Pause the video now if you'd like to figure out the right combination of true and false values for X, Y, and Z. The answer will be revealed in about four seconds. We also know that it's an NP problem because once presented with a solution, it's very easy to check that it's correct. We just plug in the values and see if the formula returns true or false. Looks like we're doing xylography and yarn crafting, just what I wanted, but this really doesn't look like a computer at all. How on earth could this do everything a computer can do? - Boolean formulas are this incredibly sort of expressive language. You can basically describe all kinds of constraints, all kinds of, so just like you know our language has the seeming capacity to just describe everything. We can talk about, you know, trees and planets and societies and everything. This whole sort of like logical language sort of has the same kind of power when it comes to, at least when it comes to sort of mathematics. You can sort of describe any sort of objects, you can talk about (indistinct), graphs, whatever. And so this whole question of is a particular boolean formula satisfiable basically often kind of boils down to the question of, you know, does a certain kind of object exist? Does the number with this property exist? Does a graph with this property exist? And so the satisfiability question is sort of very, very powerful because it is one where this universality is perhaps most, most salient. - Cool, so that's the SAT problem. So now what Cook and Levin did was show how a SAT formula could mirror the behavior of a particular type of computational model. We all have some intuitive idea of what computation is. For one, it's what these things do, but a vague idea isn't good enough. If we really want to examine these bad boys and see what they're capable of, we need a rigorous mathematical model. A mathematician named Alan Turing came up with such a model called the Turing Machine. I've made a video all about the Turing Machine and why it was needed, so make sure to check that out if you'd like to know the full story. Here, we'll just stick to the basics. A Turing Machine consists of several components that together capture our intuition of computation. A tape, which is the computer's way of keeping track of what it's working on. It contains symbols like ones and zeros. Then there's a head which points to a cell on the tape and tells the computer what part of the tape it is currently focusing on. The head can move left or right, it can write and erase symbols at any moment in time. The machine is in a particular state and it's looking at one particular square on the tape. For the machine to know what to do at each state, it has an instruction manual. This tells the machine what to do in its given state depending on what is written on the square. For example, if it's in state three and reads a one, it replaces the one with a zero, moves the head to the right and enters state five. It then checks the instruction manual to know what to do next. For example, in state five, if it reads a zero, it leaves it as a zero, moves the head right, and enters state 100. A Turing Machine mathematically captures our intuition of what computation is. Although simple, they are extremely powerful. A Turing Machine can do everything a modern computer can do. We can extend our idea of non-deterministic computation to Turing Machines and imagine a non-deterministic Turing Machine. A non-deterministic Turing Machine is like a Turing Machine, except that instead of only being able to follow one instruction, the machine can branch out and follow several. As long as one of the options is correct, the machine will find it. It follows that a non-deterministic Turing Machine could solve any NP problem as they could run any non-deterministic algorithm. When we said that Cook and Levin showed that a SAT formula could represent computation, they showed that a SAT formula could represent a non-deterministic Turing Machine. That you can construct a SAT formula to carry out any computation that any non-deterministic Turing Machine could do. Now comes the crux of the entire video showing how a SAT formula can represent a Turing Machine, the so-called Cook-Levin Theorem. So Cook and Levin showed how a SAT formula can represent any non-deterministic Turing Machine, but that is a very hefty process, which involves a lot of complex math and is an entire lecture of a computer science degree. I've linked some videos if you'd like to look at the complete process. However, here we are going to convey the same idea with a very simple toy example. This Turing Machine contains only one cell in the tape. In it can either be written a one or a zero. If it's a one, the Turing Machine returns a yes and halts. If it's a zero, the Turing Machine returns a no and halts. The question we are asking is, does this Turing Machine return a yes? In this case, the tape has a one written in it, so the machine will indeed return a yes. This machine only has three states start, yes and no, and there are only two time steps in the whole computation. The head also doesn't even need to move, the machine is that simple. The behavior of this Turing Machine is captured by this formula. Whoa. Okay, let's break it down. Let me explain the notation. I've chosen the letter S as a variable to represent the state of the Turing Machine during each time step. This says that at timestep one, the machine is in the start state. This says that at timestep two, the machine is in the yes state. I've chosen the letter X to represent what symbol is written on the square during each timestep. This says that at timestep one, a one is written in the square. This says that at timestep, one a zero is written in the square. So this bracket says that if we are in the start state at timestep one, and the machine reads a one on the tape, the machine moves into the yes state at timestep two. This bracket says that if we are in the start state at timestep one, and the machine reads a zero on the tape, the machine moves into the no state at timestep two. Like with our party example, we want to make this whole formula true because we have an or here this time. This whole thing is true if one of the brackets or if both of the brackets are true. Because all of these variables are connected by ands, they all need to be true for a bracket to be true. What this is telling us is that for the formula to be true, one or both of the instructions need to be followed, although it actually can't be both in this case, but we'll get to that later. Hence, the variables we need to be true for the whole formula to be true are S start T equals one, X one T equals one, and S yes T equals two, or S start T equals one, X zero T equals one, and S no T equals two or all of them. But we're not quite done. There are some things the machine does that this formula doesn't capture, like the fact that only one symbol can be in the square at a time, or that the machine can only be in one internal state at a time as it can't be in both the yes and no state. It also doesn't specify what symbol is actually written on the tape before we run the machine, but not to worry because Cook and Levin came up with formulas for all of these constraints too. To make sure only one symbol is in the square at a time, we add this formula. The left bracket captures the case where there is a one written in the tape and not a zero. For it to be true, X zero must be false and X one must be true. The right bracket captures the opposite case where a zero is written in the tape. For the bracket to be true, X one must be false and X zero must be true. The or between the brackets means that at least one of the brackets must be true for the entire formula to be true. When both X zero T equals one and X one T equals one are true, both brackets are false and hence the whole formula is equal to false. In other words, we cannot have both a zero and a one in the square at timestep one. However, if X zero T equals one is true and X one T equals one is false or vice versa, then one of the brackets is true and the whole formula is true. We add the equivalent formula for timestep two. To represent the constraint that the machine can only be in one internal state at a time, we add this formula. It looks long, but it's very simple. When more than one of the S variables is set to true, the whole formula is false. However, when only one of the S variables is true, the whole formula is true. We add the equivalent formula for timestep two, hence the machine can only be in one internal state at a time. And finally, our Turing Machine starts with a one written on the tape. We represent that very simply by adding X one T equals one. To ensure that all of these conditions are met, we connect all the formulas by an and. This way for the whole formula to turn out true, each sub formula must be true. This SAT formula represents every single part of this Turing Machine. So we've just showed how a SAT formula can represent this incredibly simple Turing Machine by asking, does this formula have a satisfying assignment of X, Y, and Z variables? We're effectively asking, will this Turing Machine return a yes, Cook and Levin showed how to build a SAT formula to represent any Turing Machine, even a non-deterministic Turing Machine, that the SAT problem could answer any question a non-deterministic Turing Machine could answer. So if we did find some kind of way to solve the general SAT problem, that would mean that we could simulate a non-deterministic Turing Machine and remember how a non-deterministic Turing Machine could solve all of the NP problems. So that tells us that solving SAT would also solve all other NP problems. Thus SAT was the first ever NP complete problem, a problem that could simulate a non-deterministic Turing Machine. This is incredible, but it's not what I promised. SAT is just one problem and I promised that solving any of these problems would solve them all. Here comes a new player in the game. Cook won a Turing award for kicking off the whole NP complete movement, but he shared his Turing award with another computer scientist, Richard Karp. In a revolutionary paper, Karp showed that another 21 problems were NP complete in the same way that SAT was. One of these problems was the clique problem from earlier in the video. That's right, the clique problem can also simulate a non-deterministic Turing Machine. Since Karp's revolutionary paper researchers have identified thousands of NP complete problems in a wide range of fields such as cryptography, optimization, and artificial intelligence. NP completeness unifies thousands of problems that were previously distinct, instead of having many research teams trying to tackle thousands of separate problems, the theory of NP completeness exposes the fact that the fundamental difficulty in solving these separate problems is the same one, the guessing part and that solving it for one problem would solve it for all others. This is why computer scientists don't really think we'll ever solve these NP problems because that would mean that non-determinism was not more powerful than determinism, and that checking that something is right is just as easy as finding the solution to it. That's a pretty big shift in worldview. However, no one has yet proved that there isn't a faster algorithm for these problems. Making the question of whether P equals NP, the biggest open question in computer science. One of the most important things about what Cook and Levin did is that now to show that a problem is NP complete, you just need to take a known NP complete problem and reduce it to your new problem. This is much easier than reducing it from a non-deterministic Turing Machine. In fact, when faced with a new hard problem, it's often the first instinct of computer scientists to try to reduce a known NP complete problem to their new problem. This is so they know whether or not to spend hours trying to solve it or whether it joins the ranks of the infamous NP complete class. A reduction is one of the most important techniques in theoretical computer science and in a bonus video on Nebula, I show you how to reduce the SAT problem to the clique problem, proving that the clique problem can also simulate a non-deterministic Turing Machine. Watching this will give you insight into how such different looking problems are actually the same one in disguise, and you'll understand the powerful techniques that computer scientists use. The reason I chose not to include the reduction in this video is that it's quite technical. The YouTube algorithm tends to reward exciting and snappy content, but sometimes you do need to go deeper into the math to really understand and appreciate a topic, but I know a lot of you already know that and love a more in-depth explanation. So thankfully, I was still able to make the bonus video on Nebula. Nebula is a streaming platform made by myself and a bunch of educational creator friends. What's amazing about it is because it's a subscription service, we can make the content we really want to without worrying about whether or not the YouTube algorithm will reward us. So I can include the more in-depth explanations that I know a lot of you love. Signing up to Nebula is the best way to support this channel and the creation of educational content in general. I'm deeply thankful to my subscribers who are currently supporting me by being subscribed to Nebula. There's a ton of exclusive high production content there by some of your favorite creators, including Real Engineering, Joe Scott, Real Science, and many more for you to discover. If you sign up with my link right now, you get free access to Nebula classes where creators make classes about how to do what they do best. One class I highly recommend is What Is Code taught by the coding (indistinct) Daniel Shiffman. For anyone who liked this video and would like to learn more about the practical side of computer science, it's a beginner friendly introduction to the ins and outs of coding. If you sign up using the link on screen, which you can also find in the description, you can support me directly and get both Nebula and Nebula classes with a 40% discount off the annual plan. That's just a little over $2.50 a month for all of this exclusive, high quality content. As I mentioned, signing up to Nebula is the best way to support the channel and the creation of educational content. Thank you for watching, and I'll see you in the next episode. Bye. (upbeat music plays)

Info

Channel: Up and Atom

Views: 448,339

Rating: undefined out of 5

Keywords:

Id: ctwX--JEzSA

Channel Id: undefined

Length: 33min 3sec (1983 seconds)

Published: Fri Mar 31 2023