P vs. NP: The Biggest Puzzle in Computer Science

Video Statistics and Information

Captions
Is it possible to invent a computer that computes anything in a flash? Or do problems exist that would stump even the most powerful computers imaginable? How complex is too complex for computation? These questions are central to a fascinating conundrum called the P versus NP problem. P versus NP is one of the great unsolved problems in all of math and computer science, or, you know, really in all of human knowledge, if we're being honest. Anyone who finds a solution could win a handsome $1 million prize offered by the Clay Institute. And this solution could lead to breakthroughs in everything from medicine to artificial intelligence, and even the perfect game of Mario Brothers. But it would come at a cost. A definitive answer to the P versus NP problem could break your bank account and even lead to the end of the internet as we know it.

To find out why, let's start with a simple logic puzzle. A robot arrives in a foreign land where everyone either always tells the truth or always lies. The robot reaches a fork in the road with two choices. One path leads to safety in the land of truth tellers, while the other leads to doom in the land of liars. A sentry appears, but it's unclear which group they belong to. What question can the robot ask to determine the safe route? Here are some options... And here's the answer. With this one question, both types of sentries would point to the correct safe path, allowing the robot to solve the problem quickly. Now, what if the robot encounters multiple sentries and dozens of pathways leading who knows where? What can it do to find safe passage? Compared to the first problem, is a correct solution here proportionately more difficult to find? Or is it exponentially harder? Maybe there's a shortcut that could speed things up? Or maybe finding a solution to this complex situation is nearly impossible.

The question of how hard a problem is to solve lies at the very heart of an important field of computer science called computational complexity. Computational complexity is the study of the inherent resources, such as time and space, that are needed to solve computational problems, such as factoring numbers, for example. And especially it's the study of how those resources scale as the problems get bigger and bigger. Computational complexity theorists want to know which problems are solvable using clever algorithms, and which problems are truly difficult, maybe even practically impossible, for computers to crack.

So how do computers solve problems in the first place? A computer's core function is to compute. These machines perform basic mathematical operations like addition and multiplication faster than any human. In 1936, a 23-year-old British university student, Alan Turing, developed a fundamental theory of computation. While attempting to solve a problem about the foundations of math, Turing claimed in a paper that it's possible to invent a machine which can compute any computable sequence, given enough time and memory. This simple, theoretical Turing machine, which includes a place to store information and a device to read and write new information based on a set of instructions, is the basic framework for all digital computers to come.
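To make that description concrete, here is a minimal sketch of a Turing machine simulator in Python. The tape contents, states, and instruction table are illustrative assumptions, not anything specified in the video; the example machine simply flips every bit on its tape and then halts.

```python
# A minimal Turing machine sketch: a tape to store information plus a head
# that reads and writes symbols according to a table of instructions.
# The machine defined below (an illustrative choice) inverts every bit.

def run_turing_machine(tape, rules, state="scan", blank="_"):
    """Run a one-tape Turing machine until it reaches the 'halt' state."""
    tape = list(tape)
    head = 0
    while state != "halt":
        symbol = tape[head] if head < len(tape) else blank
        new_symbol, move, state = rules[(state, symbol)]
        if head < len(tape):
            tape[head] = new_symbol
        else:
            tape.append(new_symbol)
        head += 1 if move == "R" else -1
    return "".join(tape)

# Instruction table: (state, symbol read) -> (symbol to write, move, next state)
flip_bits = {
    ("scan", "0"): ("1", "R", "scan"),
    ("scan", "1"): ("0", "R", "scan"),
    ("scan", "_"): ("_", "R", "halt"),   # reached a blank cell: stop
}

print(run_turing_machine("10110", flip_bits))  # -> 01001_
```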
One of the key insights of computer science is that Turing machines are mathematically equivalent to just about any other programming formalism that anyone has invented, at least for conventional computers. Inside a digital computer, data are represented as binary bits, sequences of ones and zeros. And these bits can be manipulated through logical operations using a branch of math that preceded the invention of computers called Boolean algebra. The basic rules of Boolean algebra were formulated in 1847 by English mathematician George Boole. Boolean algebra formulates decision problems, yes-or-no problems that can be asked in sequence to answer complicated questions. The output of a Boolean function for any given input can be represented in what's called a truth table, and it is either one or zero, true or false. Expressions in Boolean logic consist of multiple variables linked together by three logic gates: AND, OR, and NOT. These logic gates work like this. For an AND gate, when the inputs for both variables X and Y are true, the expression evaluates to true; otherwise it's false. With the OR gate, the expression evaluates to true when either X or Y is true. A NOT gate simply inverts the value of a variable, switching a true to false or vice versa. By using only these three simple logical operations and building them up into various configurations, Boolean formulas can operate like a Turing machine and theoretically solve any computable problem.

In 1937, an electrical engineering graduate student, Claude Shannon, demonstrated in his master's thesis that the Boolean operations AND, OR, and NOT can be calculated using electronic switching circuits. A decade later, Bell Labs introduced the first solid-state semiconducting transistor. Transistors are simple, electronically controlled switches with three wires: a control wire and two electrodes to input and output data. They can be in one of two states, on or off, representing either a one or a zero, true or false. Transistors can be combined and arranged in different configurations to mimic the Boolean logic gates of AND, OR, and NOT. By combining enough of these transistors together on computer chips in complex arrangements called circuits, it's theoretically possible to compute almost anything. Anything that is computable, that is. More about that later.

In the 1950s, Hungarian-American mathematician and computer scientist John von Neumann brought the modern computer one step closer to reality. He developed the architecture of the universal electronic computer, the same architecture at work inside most electronic computing devices today, from smartphones to supercomputers. Since the mid-1950s, when the first transistor-based electronic computers were built, the technology has swiftly advanced. By scaling down and cramming more and more transistors onto tiny chips, computing power and speed have nearly doubled every two years. Today, computers can perform trillions of calculations per second. But even that's not always good enough.

There's a class of problems that computers can never solve, a quandary Alan Turing foresaw in the same paper where he conjured up his calculating machine. Turing had the insight, and in fact proved, that not everything is computable. The limiting factor is not the hardware, but instead the software or program. Computers solve problems by following lists of instructions called algorithms. An algorithm just means a step-by-step procedure for solving a problem. Running an algorithm is analogous to following the steps of a recipe or an IKEA assembly manual.
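Before the sorting example, here is a minimal sketch of the three Boolean logic gates described above, written as plain Python functions, along with the truth table of one small expression built from them. The particular expression is only an illustrative assumption.

```python
from itertools import product

# The three basic Boolean logic gates described above.
def AND(x, y): return x and y
def OR(x, y):  return x or y
def NOT(x):    return not x

# Truth table of an example expression built from the gates:
#   (x AND y) OR (NOT x)
print(" x     y     output")
for x, y in product([False, True], repeat=2):
    out = OR(AND(x, y), NOT(x))
    print(f"{x!s:5} {y!s:5} {out!s}")
```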
Here's an example of an algorithm that sorts a list of numbers from lowest to highest. It works like this. Take a random list of numbers. Start with the first number from the left, which here is six. Now look through the remaining numbers and identify the smallest one, which in this case is three. Next, swap the places of six and three, and you'll have the smallest number in the group on the left. Take a second pass, this time starting with the second number from the left, swap it with the smallest remaining number, and so on. After enough passes, all the numbers are sorted, lowest to highest, using this simple algorithm.

Any problem that can be solved by an algorithm is computable. But by the 1970s, computer scientists realized that not all computable problems are created equal. Some turned out to be easy, while others seemed hard. For problems like multiplication, computer scientists found really fast algorithms, but for others, like optimizing the operations in a factory, they couldn't find any easy solutions. And it turned out that some problems were so hard, the only known way to solve them could end up taking more solution steps than there are subatomic particles in the universe. Thus, the P versus NP problem was born.

So what exactly do P and NP mean? P problems, or problems that can be solved in polynomial time, are the types of problems that are relatively easy for computers to solve. The vast majority of what we use our computers for on a day-to-day basis, you know, you could think of as solving various problems in P: from finding the shortest path between two points on a map, to sorting a list of names alphabetically, or searching for an item on a list. These are all examples of polynomial, P, problems. A polynomial is a mathematical function that can contain variables raised to a fixed power or exponent, like n to the power of two or n cubed. The time it takes a computer to solve P problems grows polynomially as the input increases in size. Given enough computing power, all problems in P can be solved by a computer in a practical amount of time.

Now, NP problems are a class of problems that share a unique feature: if given the solution, it turns out to be quick and easy to verify if it's correct. Easily solved P problems are contained within the class of all NP problems because they can also be verified relatively quickly, in polynomial time. However, there's a class of NP problems that are easy to check, yet seem difficult to solve in the first place. It's really helpful to think about something like a jigsaw puzzle or a Sudoku puzzle, right? We all have experience that, you know, these could require an enormous amount of trial and error to solve. But nevertheless, you know, if someone says that they solved it, it's much easier to check whether they did. If someone gives you a completed hard puzzle like a large Sudoku or crossword, the answer can be verified much faster than the puzzle can be solved in the first place. It's the same for a computer.

The reason these types of NP problems can be more difficult for a computer to solve is because, as far as we know, their complexity increases exponentially as their input increases. An exponential function has the variable in the exponent, like two to the power of n. Here's an example of exponential versus polynomial growth for the same input. With these hard, exponential NP problems, the increased complexity of large inputs can quickly exceed the limits of what a computer can compute in a reasonable amount of time, and solving them using brute-force techniques alone is practically impossible. You could be doing that, you know, from now until the whole universe has degenerated into black holes, and you wouldn't have even made a dent in it.
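Here is a minimal Python sketch of the pass-by-pass sorting procedure described above (a selection-style sort), followed by a quick comparison of polynomial growth, like n squared, with exponential growth, like two to the n. The specific input list and sample sizes are illustrative assumptions.

```python
def selection_sort(numbers):
    """Sort a list by repeatedly moving the smallest remaining number
    to the front, one pass at a time, as described above."""
    numbers = list(numbers)  # work on a copy
    for i in range(len(numbers)):
        # Look through the remaining numbers and find the smallest one.
        smallest = min(range(i, len(numbers)), key=numbers.__getitem__)
        # Swap it into position i.
        numbers[i], numbers[smallest] = numbers[smallest], numbers[i]
    return numbers

print(selection_sort([6, 4, 3, 8, 5]))   # -> [3, 4, 5, 6, 8]

# Polynomial growth (n**2, roughly the work this sort does) versus
# exponential growth (2**n, typical of brute-force search over all options).
for n in [10, 20, 40, 80]:
    print(n, n**2, 2**n)
```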
Over the years, mathematicians discovered clever polynomial algorithms for some seemingly difficult NP problems, proving that these problems were actually in the simpler class P and making them solvable by a computer. This opened the door to the question: are all NP problems really P problems? Could every problem that is seemingly intractable for a computer to solve today turn out to be easy in the future?

P equaling NP would have far-ranging consequences. Solutions to nearly any problem would be well within our reach. AI would become smarter overnight. Businesses could solve complicated optimization and logistics problems, boosting the global economy. And scientists could make once unthinkable breakthroughs. Maybe we could even find a cure for the common cold? However, if P equals NP, that would also mean that all current methods for strong encryption, security measures that protect everything from your online privacy to your crypto wallet, would instantly become obsolete. It would be hacker heaven.

In the 1970s, mathematicians Stephen Cook and Leonid Levin, working independently on opposite sides of the Iron Curtain, made important discoveries that have come to define the P versus NP problem. They discovered the idea of NP-completeness. Almost all of the notoriously difficult problems in NP are actually equivalent, so if you could prove one of those is equal to P, you've solved all of P versus NP. This is kind of like a veterinarian realizing that even though toy poodles and St. Bernards look very different, they're in fact the same species, and a treatment that works on one would very well work on the other. There are hundreds of known NP-complete problems, and finding a single solution would lead to breakthroughs on multiple fronts, including physics, economics, and biology, not to mention computer science itself. NP-complete problems include a host of famous problems you may have heard of, including the Knapsack Problem, which involves the most efficient way to pack items, like in a suitcase, or the Traveling Salesman Problem, which involves route planning and navigation. Complicated NP-complete problems even underlie everyday tasks like figuring out how to deliver millions of Amazon packages on time, or the efficient, life-saving matching of organ donors with recipients, and even mastering games like Tetris or Candy Crush.

All known NP-complete problems can be transformed into one another. The most famous of these is what's called the Boolean Satisfiability problem, or SAT. SAT is one of the most famous problems in computer science. Besides its theoretical interest, it's a very fundamental workhorse problem in applied computer science, especially in software verification. SAT is a decision problem that asks: for a given Boolean formula or expression made up of n variables and the logical operators AND, OR, and NOT, is there a combination of true-false variable assignments for which the entire formula evaluates to true? If so, then it is said to be satisfiable. If someone can devise a clever, fast algorithm for the SAT problem, making it easy to compute, then voila! Proof that P equals NP. However, most computer science researchers believe that P doesn't equal NP. And proving P doesn't equal NP has turned out to be one of the hardest problems in math and computer science.
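Here is a minimal sketch of the SAT idea in Python, under illustrative assumptions: a small formula over three variables, a fast check that verifies a proposed assignment, and a brute-force search that, in the worst case, may have to try all 2^n assignments.

```python
from itertools import product

# An example 3-variable Boolean formula (an illustrative choice):
#   (x OR y) AND (NOT x OR z) AND (NOT y OR NOT z)
def formula(x, y, z):
    return (x or y) and ((not x) or z) and ((not y) or (not z))

# Verifying a proposed solution is fast: just evaluate the formula once.
print(formula(x=True, y=False, z=True))   # True: this assignment satisfies it

# Solving from scratch by brute force may require trying all 2**n assignments.
def brute_force_sat(f, n):
    for assignment in product([False, True], repeat=n):
        if f(*assignment):
            return assignment              # satisfiable: return a witness
    return None                            # unsatisfiable

print(brute_force_sat(formula, 3))
```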
In the 1980s, one promising avenue of research emerged called circuit complexity. The field studies the complexity of Boolean functions when represented as circuits. The behavior of any given Boolean function can be described by its truth table. However, the same truth table can be produced by circuits of differing complexity, as seen in these two examples. A Boolean function's circuit complexity is defined as the total number of logic gates in the smallest circuit which can compute that function. Researchers study circuit complexity to understand the limits of computation and to optimize the design of algorithms and hardware. For some Boolean functions, the minimum number of logic gates grows polynomially as the number of input variables increases. These are said to have low circuit complexity and are analogous to P-class computational problems. Functions where the number of necessary logic gates grows exponentially with increasing input variables are said to have high circuit complexity.

One potential approach for proving P doesn't equal NP required researchers to identify a single function known to be in the class NP that also definitely has high circuit complexity. In 1949, Claude Shannon proved that most Boolean functions have high circuit complexity. So how hard could it be to find one instance? Harder than it sounds. Alas, researchers encountered a mathematical roadblock called the Natural Proofs Barrier. This barrier implies that any proof that P doesn't equal NP using known circuit complexity techniques would have to have a bizarre and self-defeating character. Clearly, a different way forward was needed.

While most researchers started looking for other approaches, some began to investigate the Natural Proofs Barrier itself, leading them to new questions: How hard is it to determine the hardness of various computational problems? And why is it hard to determine how hard it is? It's a very meta area of computer science called meta-complexity. It has turned out that a lot of the progress that one can make on problems like the P versus NP problem today, you know, involves sort of self-referential arguments, involves sort of turning inward. Given the limits of known techniques, meta-complexity researchers are searching for new approaches to solve some of the most important unanswered questions in computer science. And crucially, meta-complexity is intimately connected to the question of whether provably secure cryptography schemes even exist.

One important current focus of research is what's called the Minimum Circuit Size Problem, or MCSP. MCSP asks for the smallest possible circuit that can accurately compute a given Boolean function. Is there a simple solution? The Minimum Circuit Size Problem is a problem about circuit complexity, but how complex is the problem itself? Researchers suspect that MCSP is NP-complete, but unlike many other similar problems, they haven't been able to prove it. A proof of MCSP's NP-completeness would be a big step towards showing that secure cryptography does exist. And recently there's been tantalizing progress towards that goal.

Like the robot searching for safe passage in a foreign land, the pursuit of meta-complexity is blazing a trail for theoretical computer science, a path that may lead to an answer to whether P equals NP or not. If civilization lasts long enough, you know, I would tend to think that probably problems like P versus NP will someday be solved, and my main uncertainty is, is it humans who solve them or is it AI? Is it GPT-10 that solves the P versus NP problem for us? And then will we be able to understand the solution if it does?
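As a small illustration of the point above that one truth table can be produced by circuits of very different sizes, here is a Python sketch. Both example "circuits" are illustrative assumptions: a six-gate expression and a single NOT gate that happen to compute the same Boolean function; MCSP asks, in general, how small such a circuit can be made.

```python
from itertools import product

# Two circuits for the same Boolean function of (x, y).

# A six-gate circuit: two ANDs, three NOTs, and an outer AND.
def big_circuit(x, y):
    return (not (x and y)) and (not (x and (not y)))

# A one-gate circuit that turns out to compute the same function.
def small_circuit(x, y):
    return not x

# Compare their truth tables: identical outputs for every input, so this
# function's circuit complexity is at most one gate.
for x, y in product([False, True], repeat=2):
    assert big_circuit(x, y) == small_circuit(x, y)
    print(x, y, small_circuit(x, y))
```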
Info
Channel: Quanta Magazine
Views: 669,711
Keywords: science, quanta, quanta magazine, explainer, science explainer, science video, educational video, artificial intelligence, computer science, computational complexity, computational complexity theory
Id: pQsdygaYcE4
Length: 19min 44sec (1184 seconds)
Published: Fri Dec 01 2023