Lecture - 1 Overview of the course

Video Statistics and Information

Video

Captions Word Cloud

Reddit Comments

Captions

Welcome to the course on design and analysis of the algorithms. Today’s lecture is going to be an overview of the course and I want to introduce you to the main problems in the course and also give also convey a spirit of the course. Let us start with the fundamental question. Given a certain problem how do you solve it on a computer? For many problems it is relatively easy to design algorithms that will somehow solve them. However, quite some cleverness is needed and designing algorithms that are also fast that is algorithms which give answers very quickly. This will be the major challenge in the course. Our main course goal as said is to design algorithms which are fast. Design of fast algorithms. As you may realize designing anything be it computers, be it cars, be it clothes is an art. So, in some sense you have to be creative and it cannot be taught, but in some other sense there are also some very well defined design techniques which have evolved for this purpose. And the goal of this course is to study these techniques we will also have lots of exercises in which you can apply these ideas. And our hope is at the end of the course you will be able to solve algorithm design problems that may that you may encounter later on in your life. There are some prerequisites for this course and let me just state them. The prerequisites are that you should be familiar with some programming. You should have done some amount of programming in some common language say c or scheme or basic. You should also have done a course on data structures and you should have some discrete mathematics background. As you may see this is not an elementary course and this background is going to be necessary for us. Our approach is this course is going to be analytical. In the sense that will build the model of a computer analytical will build a model of a computer that is mathematical build a mathematical model. And on this mathematical model we will be designing algorithms. So, we design algorithms and we study their properties of algorithms on this model. The whole point is to reason about algorithms. So, this is very important if we want to mathematically prove properties and we want to prove properties the emphasis on this course is proof. Prove properties about speed say prove facts about time taken. All this so there is going to be a number of things that we have to do, and of course there is a whole course for this. So, I want to give an overview of the course. In the next few lectures we will develop the basic framework what I mean by this is that we will define the mathematical model. We will say what fast means when I say fast algorithm what do I mean? This is what we will say. Then we will embark on a fairly long stretch which involves techniques for designing fast algorithms while doing this we will also be surveying many problems. So, we will look at problems from optimization graph theory some problems from geometry also some others. It will turn out that there are some problems which do not really quite respond to our algorithm design techniques. Of course, we will see many problems for which we can design really god algorithms our techniques work beautifully, but there are also some problems where our techniques do not work so well. And for these problems a fairly intricate theory has developed over the last over the last several year’s 10 20 30 years and this theory is the, so called theory of N P completeness. So, we will be studying this theory as well. Let me now, come to the topic for today. The main topic for today. So, today I do not want to be too formal, but I nevertheless want to convey the spirit of the course to you. So, here is what we are going to do? We are going to take a fairly simple problem which everybody will have no should have no trouble in understanding and let me state that problem right away. The problem is given 2 numbers m and n find their greatest common divisor. So, everybody understands this problem. You are going to see 2 algorithms for this. One is a very simple algorithm which you probably have learnt in school. It is probably the algorithm that you were taught in first standard or fourth standard or something like that and which probably most of us used to solve this to find the greatest common divisor where we need to such as when simplifying fractions. Then we will study another algorithm and this is one of the earliest algorithms which was ever invented. And this was invented by the mathematician Euclid who you might know from plane geometry. Yes Euclid did invent this algorithm even though there were no computers in his time. So, we will study Euclid’s algorithm and from this we will get a sense of what a fast algorithm is you will see that Euclid’s algorithm. Even though we have not defined what fast is you will intuitively understand that Euclid’s algorithm must be much faster will be certainly much faster than the simple school level algorithm that we are talking about. And Euclid’s algorithm is also clever and that makes it more exciting and that is also again the spirit of this course. So, let me begin with the simple school level algorithm for factoring. So, everybody knows this, but let me state it anyway. So, basically there are 2 or 3 steps . Let me write down the input it is 2 integers m and n. What we need is the greatest common divisor which is also the largest integer that divides both and when I say divides I mean without leaving a reminder. So, here is what the algorithm will look like. So, step 1 we are going to factorize m. What does it mean? It means finding primes that we call them m 1 m 2 till m k such that m is m 1 times m 2 times all of this. The next step is to factorize n. What does that mean? Again find break n into its factors. So, write n as n 1 time n 2 times till some n j. Note that the same factor may appear several times and of course, we are here to write it separately. The next step is to identify common factors and then multiply and return the result. We are going to take an example of this and in a minute and we will see we will see what exactly these steps do. Let me now, state Euclid’s algorithm. So, I am going to write this as a procedure. So, Euclid is a procedure which will take 2 arguments m and n. And I am going to invent a pseudo language as I go along and that is again going to be in the spirit of the course we are not going to be too picky about how we write down algorithms? So, long as what I mean is clear to you perfect everything is fine. So, you could express your algorithms in the most in the most suitable nicest syntax. So, that it is easy for you to get your meaning across. We will come to all this in somewhat more detail in the next lecture or, so what does Euclid do? So, the first is a check. So, we are going to check whether m divides n while m does not divide n we are going to do the following. So, first we are going to calculate the remainder. So, we are going to calculate r of n mod m. Then we are going to set n equals m and then we are going to set m to be r. Al these steps are going to be done inside the loop. So, that is the end of the loop over here we will prove soon that eventually this loop will have to terminate. And after it terminates all that Euclid’s algorithm does is to return the value that m has at that point and that is it that is all there is to the algorithm. It is not clear when you first look at this algorithm that in fact, this algorithm works. It seems to be doing some divisions and taking some remainders, but it is not actually factorizing any of the numbers that you want who is greatest common divisors you want. So, let us now, take some examples and see whether or not this algorithm works. So, let us first take let us first take a small example where we have m equals say 36 n equals 48. How will you do it using our simple algorithm. Well we factor m is equal to a product of prime factors. So, the idea is going to be that we are going to test numbers one after another. So, we start with say 2 and yes 2 is a factor. So, we have to write it as 2.So, then we divides 36 by 2 we get 18. So. In fact, now, you have to factorize 18 well 2 is again a factor. So, then what remains is 9. Now, 2 is not a factor. So, you go out to the next prime number which is 3 which is a factor and then only the factor 3 remains. So, we have factorized m what about n? Again we do the same thing we start with 2 yes 2 is a factor. So, that gives us 24. Again 2 is a factor that gives us 12. Again 2 is a factor. So, that leaves us 6 again 2 is a factor and then that leaves us 3. So, then we want to identify what factors are common well these 2 are common and then 1 3 is also common. So, common the common factors are 2 2 and 3 and, so G C D is equal to 2 into 2 into 3 or 12 nothing terribly difficult in this. This is all of course, school level stuff. So, let us contrast this with what Euclid’s algorithm does. So, let me bring Euclid’s algorithm back. So, we are going to start with m equals 36, n equals 48. So, let me write it down as we go along. So, doe’s m we have to check whether m does not divide n. So, 36 divide n no, so when, so therefore we enter the loop. So, now we calculate r equals n mod m. So, in other words r is equal to 48 mod 36 that is equal to 12 then we calculate n is equal to m. Let me write down over here that well we know that m is equal to 36 and n is 48. So, n takes the value that m originally had. So, n is now, going to become 36 and m takes the value which r has, so r is going to become 12. So, at this point the iteration has ended. So, at this point our iteration has ended and we have left with m equals 36 and n equals 36 and m equals 12. But now, we have to go and check out the loop again, because well that is what the while loop says. So, this time n equals 36 and m equals 12 and again we had to check whether m does or does not divide 12 m r does not divide n. So, does 12 divides 36. This time the answer is yes and therefore, we are going to exit the loop. And in fact, at the end of it we are going to return the current value of m which is equal to 12, so we will be returning 12. So, let us now, compare these 2 algorithms. So, here you can see roughly what the work done by the 2 algorithms is. The simple algorithm had to do factorizations then it had to collect common factors and it had to do it had to multiply the factors together and return the answer. Euclid’s algorithm on the other hand did 1 division. So, first time it divided 36. It divided it checked if 36 divides 48 which is the value of n that we had. And when it found that answer was false the division was not possible then it took the remainder. Then it simply exchanged the numbers basically and then again it did 1 more division. So, I can summarize the work over here by saying that this did roughly 2 divisions. How much work does this do? Well it calculated these, so many factors. So, it calculated 4 factors, so it had to at least do 3 divisions to get each factor. Here it calculated 5 factors. So, it had to at least do 4 divisions. So, it at least did 9 divisions probably it did more. You can see in a quick example immediately where it will have to do many more than these. As you can see in this school level very simple factoring algorithm we use 9 divisions whereas, in this somewhat sophisticated algorithm we use really 2 divisions. So, clearly you have done less work. And although i have not proved yet that Euclid’s algorithm in fact, works you can see that it is returning the correct answer. So, but we will show later on that. In fact, Euclid’s algorithm does do does work correctly. So, let me take one more example m equals 434, n equals 966. So, these are somewhat large numbers and let us see what will happen over here. So, suppose you want to factorize these numbers first of all it will take some work, but let us see what the answer is going to be. So, m can be written as 2 times 7 times 31, n can be factorized as 2 times 7 times 139 what is the answer? The common factors are 2 and 7 and the G C D therefore, is 14 . Let us now, come back to Euclid’s algorithm. So, let me write down again m is equal to 434 and n is equal to 966. The first step is to compute the reminder of n mod m. So, then r is going to be according to this step r is going to be 966 mod 434. Well 966 is equal to 2 times 434 which is 868 plus 98 under 4 I can write down r as equal to 98. After this we just had to exchange values basically. So, now, n is going to take the old value of m. So, n is going to be 434 and m is going to take the value of r, so which is 98. So, at the end of one iteration of Euclid’s algorithm we have new values of m and n and these are the new values. So, we started off with the values 434 and 966. After 1 iteration we have the values 434 98 and 434 in that order m and n. So, now, we just have to repeat the same thing with these new values, so this is the end of the first iteration. And in the second iteration we are again going to calculate r equals n mod m. So, let us do that. So, if you calculate r equals n mod m we are going to get the new value of r to be equal to 434 mod 98. So, you have to do the division over here and which is 434 can be written as 98 times 4, so this is 392, so plus we are going to get 42. So, r is going to be equal to 42, but of course, when we began this iteration we had to check whether m does or does not divide n or whether 98 does or does not divide 434, but clearly it is leaving the remainder and therefore, we will enter this iteration. After that we are going to set n equal to m, so m the value of n is 98 and m will be equal to r which is 42. So, at the end of the second iteration these are the values that we have. So, n is equal to 98 and m is equal to 42. So, we will have to go into the third iteration for the beginning of the third iteration, let me just write down the values. So, we have m equals 42 n equals 98 again we are going to check does m divide n. It does not and, so we will enter the loop when we enter the loop we have to find out r equals n mod m. So, that is equal to 98 mod 42. 42 times 2 is 84, so 98 mod 42 is 14. Then there is just a matter of setting n and m correctly. So, n will now, equal the old value of m. So, this is 42 and m will equal the value of r we just calculated. So, it is going to be 14. So, these are the values at the end of the third iteration. So, now, again we are going to come back and execute this loop and again we are going to check does m divide n or not. And this time we will see that 14, in fact does divide 42 and at this point the loop will be exited. And at the end we are going to return the result as m. The value of m right now is 14 and therefore, 14 will be returned. So, what has happened? We took 3 iterations. So, summary of this all this is we took 3 iterations what happened in each iteration well we did 1 division per iteration. And before quitting from the loop we have to do one more iteration one more division and therefore, that took 4 divisions essentially. So, Euclid’s algorithm for this somewhat complicated problem m equals 434 n equals 966 where we are asking to find the greatest common divisor of m equals 434 and n equals 966 took 4 divisions. Let us now, check what happens with the simple algorithm. So, here where the factors that we found. So, how many divisions did it take? At first glance you might see you might think that it is only going to take 2 divisions to calculate m and 2 more divisions to calculate n, so which is 4 divisions, but that is not quite correct. To check that there is no factor between 2 and 7 we would require checking for 2 3 for 3 five as well. Now, after dividing by 7 we had to check whether 139 is a prime. So, that would also involve checking all the numbers until 39. So, that would require substantially many divisions. So, here again we require many divisions many more than Euclid’s algorithm. In fact, you will see that if we take bigger and bigger numbers factoring them becomes much harder. You will have to do a lot more work whereas, in Euclid’s case we will just do divisions and we are we will. In fact, show now that as we do the divisions then numbers will become smaller and smaller and the algorithm will terminate. So, the next topic that we are going to get into is I want to argue that Euclid’s algorithm actually works. I am not going to give a fairly very detail proof I just want to indicate the main idea. The main idea is really a fact about divisibility and remainder and things like that, so let me indicate that. So, this idea says that if m divides n then G C D of m and n must; obviously, be equal to m, because clearly m will be the largest number which divides both m and n. If not then we can write G C D of m and n is the same as G C D of n it is the same as the G C D of n mod m and m how do you prove this? I will leave it as an exercise, but let me just mention the main idea. The main idea is suppose g is the G C D then we can write m as a times g and n as b times g where a and b are relatively prime having written them in this manner you should be able to just substitute into what we have in the fact and then you should be able to get the answer. Actually the fact itself is very similar to what we had in Euclid’s algorithm in the fact that we have written down. So, let me just show you the iteration the loop part of Euclid’s algorithm. So, the fact says that if you want to calculate the G C D of m and n then you might as well calculate d G C D of n mod m and m what does our loop do? It wants to calculate we want to calculate d G C D of m and n. It first checks what the remainder is and then it calculates the fact is essentially n mod m. So, it is essentially this term over here and we need and then it sets m to be equal to r. So, the first argument is set to r and the second argument is set to the old value of m, but this is precisely what the fact says. The fact says that if you want to calculate G C D of m and n instead calculate G C D of n mod m and m. In fact, once we have once we are given this fact the proof of Euclid’s algorithm the correctness of Euclid’s algorithm is at least partially done. Because what we have accomplished is that we are sure that as we go through iterations we will never be we will always maintain integer’s m and n and who’s G C D will be the G C D of the original values of m and n. So, this is the invariant that we are going to maintain. So, m the specific value of m and n might change, but their G C D is never going to change. So, let me write this down. So, as the loop executes m and n might change, but their G C D does not. And hence what we have is that we will preserve the G C D and eventually when we exit the loop will exit with the same G C D. We just established that as the loop executes m and n might change, but curiously enough their G C D continues to remain the same. As a result if we ever get out of the loop then that will be, because m divides n, but notice that even at this point the G C D will still be the same as the old G C D the G C D of the old m and n. But if m divides n then the G C D will simply be n and that is what you will return and therefore, in fact, we have established that if the loop in fact, terminates then we will be returning the correct value. Now, you want to argue why the loop should terminate? Why we need to exit? Why we will always exit from the loop? This is actually fairly straight forward. Let us examine Euclid’s algorithm again and see what happens in each iteration. So, long as the loop is executing. So, initially the values of m and n might be taking some values. What are the values of m and n after one iteration? Well, the first step is calculating r, but notice that immediately afterwards we are setting n to have the value m. So, if the original values are m and n then at the end of 1 iteration n will have the value m and what will the value of m be m will have the value r. So, it will be n mod m. So, what has happened is that the values m and n have now, changed to n mod m and m, but n mod m anything mod m is actually going to be smaller than m itself. So, notice that the first argument is always continuously is going to decrease in every iteration of the loop. The value of m is always going to decrease. How long can it keep on decreasing? Well, it has to remain positive. It cannot even become zero and it might decrease. It will have to decrease at least by one in each iteration. As a result of this we can conclude that the loop has to exit at some point and we know that if the loop exits then the correct value is returned. So, that proves that Euclid’s algorithm is correct. Our next step is to argue that Euclid’s algorithm in fact, runs reasonably quick. So, the basic idea we want to prove is going to be something like this. Well before that we are going to assume we are going to we are going to prove something about cost Euclid’s algorithm and in doing this we are going to assume that the first argument that we sent to Euclid is always smaller than the second. So, this means that we will always call G C D 36 48 and not G C D of 48 36. This is only for the purpose of analysis. Actually the algorithm as we wrote will work fine if make this call as well. If we make this call you will see that internally we will start with m equals 48 n equals 36, but in one iteration we will be exchanging these values. The first iteration will only be spent for exchanging these values. So, m will become 36 and n will become 48. So, then we can as good as assume that in fact, we will start off our execution in this manner. So, we will assume in fact, that we are not analyzing this because that only adds 1 iteration. So, our assumption is that we are analyzing calls of this kind where m is less than n well of course, if m is equal to n then there is no question. So, the loop wills the Euclid will immediately return . So, let me now state the main result that we want to prove. The main result that we want to prove is something like this. So, if Euclid is called with values m with values p and q that is Euclid of p q is called I want to make a distinction between the variables m n and their values. So, we will think of p q as values and m n will be variables. So, you are going to call Euclid with values p and q then if p is less than q then in each iteration the sum of the values of variables m and n will decrease at least by a factor 3 halves. So, initially if the sum is 6 then in one iteration the sum must go below 4 become 4 or go below 4. If it is 16 in 1 iteration it must go below 40 or stay or become 40. What is the implication of this theorem? This theorem establishes that the sum will drop very fast. And in fact, the number of iterations this establishes that the number of iterations is equal to log to the base 3 halves of p plus q. Well, it is less than equal to, so this theorem will put a good upper bound on the number of iterations that Euclid is going to execute. And therefore, all that we need to do is to prove this theorem and we have a bound on how many iterations Euclid takes. So, our goal now, is to estimate what happens to the sum of the values taken by m and n in each iteration. So, let us say that we start off the whole process and at the beginning of the iteration m and n take values p and q. So, at the beginning of iteration m is equal to p and n is equal to q. After the iteration and I mean just one iteration let us say m takes the value p prime n takes that value q prime. So, what is our goal we want to estimate by how much the whole thing drops. So, you want to estimate p plus q upon p rime plus q prime. So, if we prove that this ratio is at most 3 halves then we are done. So, all that remains now, is to express p prime and q prime in terms of p and q. So, for that will need our Euclid procedure again. So, what does the Euclid procedure do if it does not terminate and that is the case that we are looking at currently? Then it computes r equals n mod m and then set n equals m. So, in terms of this the new value of n is the old value of m. So, going back to this the new value of n must be the old value of m. So, therefore we get q prime must be equal to p. The new value of m is the value of r which is the old value of m mod old value of n mod old value of m. The old value of n is simply q mod m. The old value of m is p and therefore, this whole thing p prime is q mod p we need one more fact in addition to all this which concerns p prime plus q prime. Well, what is p prime plus q prime? P prime is the remainder when q is divided by p and q prime is p itself. So, this is remainder plus divisor and we know that the divisor is strictly less than the dividend. Divisor in our case is p and the divided is q and we assume that p is in fact, less than q. And therefore, we can conclude that this whole thing has to be less than or equal to the dividend. In other words this is q, so here are the 3 facts that we wanted p prime plus q prime is less than q. Let me align it q prime is equal to p and p prime is less than q mod p. If p prime is less than q mod p then I can conclude that p prime has to be less than both. In fact, it has to be less than p. So, now, it is just a matter of algebra we are going to add this and this and this last inequality, but just to get the terms right we will multiply this last inequality by 2 times. So, if we do that let me adjust this. So, if we do that addition we are going to get p prime plus q prime plus p prime plus q prime this comes from over here. The whole thing is less than this p plus this p plus this q, but let us takes this 2 times therefore, we will get 2 q. So, now it is just a matter of simplifying this. So, what is this equal to this is nothing but 3 times p prime plus q prime and we have shown that it is less than 2 times p plus q and this is exactly what we wanted. So, we have been we have showed that the original value of the sum has reduced by a factor of two-thirds. So, p prime plus q prime is less than p plus q upon 3 halves that concludes the analysis of the algorithm. And we have been able to show that in fact, the algorithm will execute a fairly small number of steps. So, let me now conclude this lecture and highlight the main points. The very first point that I want to make is that is the difference between the 2 algorithms. The school level algorithm basically uses the definition. Euclid’s algorithm uses some more interesting deeper properties deeper mathematical properties of the quantities that we are going to calculate. So, we study properties of whatever we computing and this helps in designing fast algorithms. There is also one more point that I want to make which is that the basic way in which we did this analysis counting iterations will be useful in the rest of the course and the precise details of all of this we will cover in the subsequent lectures. So, that marks the end of this lecture.

Info

Channel: nptelhrd

Views: 408,473

Rating: undefined out of 5

Keywords: Overview, of, the, course

Id: 5Y8Lfsreeck

Channel Id: undefined

Length: 51min 50sec (3110 seconds)

Published: Tue Aug 26 2008