Dijkstra's Hidden Prime Finding Algorithm

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
edar dyra is renowned for his significant contributions to computer science most notably his famous pathfinding algorithm however when you think of his name prime number generation is probably not the first thing that comes to mind commonly prime numbers are generated using the famous C veritos for its incredible speed or trial division for its reduced space requirement while these two algorithms are great for their specific use cases what you gain on one side you typically pay for on the other but what if we could somehow combine the strengths of these two algorithms hidden in some old notes of his dyra did just that prime numbers are defined as whole numbers that are greater than one and are divisible by only one and themselves for example here are all the prime numbers that are less than 100 all remaining numbers are known as composite numbers the distribution of prime numbers amongst whole numbers lacks an apparent pattern posing a significant challenge due to their unpredictable nature it can be difficult to identify prime numbers within a set of whole numbers this challenge has captivated the minds of mathematicians for thousands of years prompting a race to try to develop the most efficient Prime finding algorithm one of the oldest of these algorithms is known as trial division here's how it works what we're going to do is find all the prime numbers under 30 so first of all what's common amongst most Prime finding algorithms is that we're just going to go ahead and Skip one because one is not considered a prime number so let's move on to two so the trick for this algorithm is for each number we want to check to see if it's a multiple of any of our currently found prime numbers well as you see we haven't found any primes yet so two is not a multiple of any of our primes so we add it to the list next we move on to three and we ask ourselves is three a multiple of any of our primes three is not a multiple of two so we add it to our list of primes next we have four and again we need to check if it's a multiple of any of our primes however we don't actually have to check if that's the case for every Prime that we found so far only up to the square root of the number we're testing testing so the square root of 4 is 2 so we only need to check if four is a multiple of two don't worry I'll explain why this is in just a moment to check if four is a multiple of two we can say 4 modulo 2 which is zero if you don't know what the modulo operator is what we're saying here is if we divide 4 by two what is the remainder in this case Zero is the remainder so this means that four is a multiple of two because it divides out evenly so four fails the test here therefore it's not prime next we move on to five again we only need to check if it's a multiple of any of our primes up to the square root of five so the Square < t of five lands somewhere here between two and three so we're only looking at two here so if we do five modulo 2 or what is the remainder if we divide 5 by two which is one since we get a nonzero remainder here we know that five does not divide out evenly by two so it's not a multiple of two so it passes the prime test and we add it to our running list of primes okay so at this point you might be asking yourself why we only check if the current number is a multiple of the prime numbers that we've already found rather than all the numbers up to that point and further why only up to the square root of the number being tested why are we not checking if it's a multiple of all of our current primes so first the reason we only need to check if the current number is a multiple of prime numbers is efficiency and due to the nature of prime numbers themselves if we take any composite number at random they can be factored down to a set of primes for this reason prime numbers are known as the building blocks of natural numbers so for testing a number that cannot be factored down to prime numbers that must mean that the number itself is prime okay fair enough but why are we only checking the prime numbers up to an including the square root of the number being tested to understand this let's ask ourselves what are the factors of some number let's say 24 so first I want to Mark where the square root of 24 is because at this point something interesting happens so first of all 1 is a factor it's 1 * 24 next two is also a factor it's 2 * 12 3 is a factor it's 3 * 8 and four is a factor 4 * 6 now you'll notice that after this point all remaining factors have technically already been identified by the factors less than the < TK of 24 they're just mirrored this will actually always be the case when looking at multiples of a number so as another example let's say we want to see if 23 is a number which spoiler alert it is so let's mark where the square root of 23 is and the only factors of 23 are one and itself which is the property of being a prime number notice there are no other factors here before the sare < T of 23 so therefore there won't be any factors after the square < TK of 23 either so for the sake of efficiency there's no need to continue checking for factors after this point okay so hopefully that answered most of your questions on some of the methodology being used here so back to where we left off the next number we're testing is six the square root of six lands somewhere here between two and three so again we're only checking if 6 is a multiple of two 6 modulo 2 is zero so we know 6 is a multiple of two so it's not prime next is seven 7 is not a multiple of two because 7 / two gives us a non-zero remainder and so we add it to our list of primes next is eight we know immediately 8 is a multiple of two so let's just move on next is n so the square root of 9 is three so we need to check if nine is a multiple of two or three so 9 / 2 gives us a remainder of one but 9 / 3 gives us a remainder of zero so 9 is a multiple of three therefore it's not prime so we move on next is 10 again we can immediately see that 10 is a multiple of two so let's move on next is 11 so we need to check if 11 is a multiple of 2 or three so 11 / 2 gives us a remainder of 1 and 11 / 3 gives us a remainder of two both are nonzero remainders so 11 is prime so we add it to our list now this pattern will just continue onward but just as a final example let's see if 29 is prime the square root of 29 is somewhere here between 5 and 7 so we only need to see if it's a multiple of 2 3 or 5 29 ID 2 gives us a remainder of 1 29 ided 3 gives us a remainder of two and 29 ID 5 gives us a remainder of four all of these are nonzero remainders so 29 passes the test here and we add it to our list of primes and of course 30 can't be prime because it's a multiple of 2 three and five and so there we have it that's how we use trial division to find prime numbers the next method we're going to explore is the SE of aatases this is one of the most popular methods and by far the fastest so to start this algorithm creates a Boolean array that is the size of the max number we are checking so since we're checking all primes up to 30 this array is length 30 and initially we want to set all of these to True next we have our empty list of primes down here so again we can skip one because it's not considered a prime number so let's go ahead and set its position to false in this Boolean array next we go to two and since two is true in the Boolean array we know immediately that it's a prime number so we add it to our list after it's been added though we need to set every multiple of two in this Boolean array to false this is because later when we get to those numbers we know immediately that they're not prime numbers because they're each a multiple of two next we go to three again three is true in our Boolean array so we can instantly add it to our list of primes then we set each multiple of three to false next we go to four and since four is false in our Boolean array we know it was set to false because it's a multiple of one of our previously found prime numbers so we can move on next we have five which is true in the Boolean array so again we add it to our primes list and then we set each of its multiples defaults if they're not fals already next is six which is false in the Boolean array so we know it's not prime so we move on next we have seven which is true in the Boolean array so we add it to our list of primes this time however we don't need to worry about setting any of Seven's multiples to false because they've already been set to false by previous primes this is because seven comes after the square root of 30 in this case this is derived from the logic I described previously when explaining trial division so if you take a look at Seven's multiples we've got 14 which is set to false by two we've got 21 which is set to false by three and we've got 28 which is set to false by two so at this point our Boolean array is complete and each remaining True Value is the rest of our Prime numbers so 8 9 and 10 we can skip 11 is prime 13 is prime 17 19 23 and 29 are the rest of our prime numbers so there we have it that's how the C veranes is used to find prime numbers as you can tell it's an extremely elegant and fast solution now if we compare these two methods trial division is a much slower algorithm because for each number we're testing we have to divide it by some of our primes to check if it's a multiple but it's more space efficient because it's not having to store a large Boolean array like the SE of ostanes however like most things it's a trade-off this extra memory consumption allows the SE to perform much faster Additionally the seeve Finds Its multiples of primes using multiplication not by performing a division operation like trial division performing a division operation can be slightly computationally slower than performing a multiplication operation building upon these wellestablished Prime finding algorithms Dy decided to venture Beyond these conventional paths faced with a prevailing sentiment amongst his peers who often remarked but everyone knows that the most efficient way to generate prime numbers is by using the SE of veritos dyra saw an opportunity to not just accept the status quo but to question it he recognized the merits of these algorithms but he was driven by a desire to explore further his approach includes a pool and a list of primes again like the other methods we'll Skip One and go on to two since we know it's not Prime here we need to make another assumption we're assuming two is prime so we're just going to add it to our list of primes now once we've added a prime number to our list we also need to add it to our pool along with the square of the Prime together as a tuple on the top row here we have our primes and the bottom row will be a running counter of the multiples associated with that Prime all right so next we go to three and what we want to do is find the smallest multiple in our pool which is currently four if the number we are currently on is less than that smallest multiple which in this case it is three is less than four then we add that number to our list of primes and again once we add a prime we add it and at Square to the pool next we go to four so we find our smallest multiple in our pool which is four in this case if the number we are on is equal to the smallest multiple in the pool then the number is not prime because that means it's a multiple of a prime so instead we need to increment our smallest multiple by the prime number It's associated with so we add two here to get six next we go to Five the smallest multiple now is six and since five is less than six that means five is prime so we add it to the list along with its Square in the pool next is six our smallest multiple is six which is equal to the number that we're on so six is not Prime and we need to increment this multiple by its prime number two next is seven seven is less than our smallest multiple so seven is prime next is eight 8 is equal to our smallest multiple so it's not Prime and we increment our smallest multiple 8 by two next is nine and this time our smallest multiple is nine they are equal so nine is not prime so we need to increment this nine by the prime number It's associated with so we add three to get 12 next is 10 it's equal to our smallest multiple so it's not Prime and we increment our smallest multiple next is 11 it's less than our smallest multiple which is 12 so we add 11 to our primes list next is 12 which is equal to our two smallest multiples so 12 is not Prime and we increment both of these multiples by their Associated prime numbers next is 13 it's less than our smallest multiple so we add it to our primes list now this pattern will just continue onward and again just as a final example we get to 29 Which is less than our three smallest multiples so we add it to the list of primes and so there we have it this was the brilliant approach that dyra had taken on one hand we're just using simple addition to keep track of the multiples of prime numbers not division like we did in trial Division and we're also using a smaller data structure to keep track of our multiples compared to what we used in the C of veranes instead of keeping track of every multiple at once dyr method only keeps up with the multiples that we need as we go so I've created a python script to compare the space and time efficiency of these three methods to find all the prime numbers less than 1 million when I run this you'll see that the data structures needed in the SE of aatosan takes up about 8.6 megabytes while trial division takes up much less at about 0.6 megabytes Dyer's approach Falls somewhere in the middle at about 1.2 megabytes on the other hand the SE was much faster finishing at a fraction of a second while trial division took over 3 seconds and again Dyer's approach Falls somewhere in between if you want to play around with this code I posted it in my patreon feed linked below so I ran this experiment a few more times times using all three methods to find prime numbers up to 5 million here we have trial division having a really small space requirement but taking over 25 seconds when checking 5 million numbers then we have the seeve which is a complete opposite trade-off clocking in at less than a second but having a much larger space requirement and Dyer's approach is a really nice balance between the two being very close to the small space requirement of trial division while at the same time fairly close to the small amount of time needed for the C ver CES so here's my conclusion in most cases space efficiency is far less important than time efficiency especially nowadays with the large amounts of memory we have available on our machines so with that said in my opinion the C veritos is probably the more preferred solution in most cases its space requirement while relatively much larger than the other two methods is still not much of an issue however if you start trying to find primes up to around 1 billion then it's memory dependent definitely starts to become an issue all of these algorithms definitely have their time in place none of these methods are considered the one size fits-all solution to finding prime numbers it all depends on the scenario which was beautifully noted by Dyer himself I do not claim that my final program will be the best one measured by whatever yard stick any of my readers might care to [Music] choose if you want to play around with the code in the video or check out Dyer's notes I've uploaded them to my patreon page linked below this is where I'll start uploading code Snippets polls Early Access videos and more so head on over to check it out it's free to join the community and if you're feeling generous you can pick up a membership which helps me out a ton thanks for watching and I'll see you all in the next video
Info
Channel: b001
Views: 155,062
Rating: undefined out of 5
Keywords: python, programming, tech, software, engineering, program, technology, coding, code, coder, programmer, engineer, software engineer, computer, computer programming, tips, codelife, computers, primes, prime, dijkstra, prime-finding, finding, sieve, Eratosthenes, sieve of eratosthenes, trial, division, trial division, algorithm, algorithms, computer science, science, numbers, composite, number
Id: fwxjMKBMR7s
Channel Id: undefined
Length: 15min 48sec (948 seconds)
Published: Sun Feb 11 2024
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.