Lagrange Multipliers : Data Science Basics

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
[Music] hello everybody in this video we're gonna be talking a little bit about LaGrange multipliers so if some of you are cringing right now that's because you're probably a student of economics or stats or linear algebra and you find yourself constantly having to use these but not really remembering the theory behind them so in this video I will not be getting to the theory because I think it's a little bit heavy to get into that right now but we're gonna be talking about how to use Lagrange multipliers when you know it's time to use the Lagrange multiplier and considerations like that so we're gonna be talking more about the how to execute a Lagrange multiplier problem and less about the theory you may make a Theory video in the future but that's just not what this particular video will be about all right so here's the setup we are explorers of a new land and our goal is to discover new species so all of our team is right here at the origin and of course we put a u1 u2 grid so that's the two basic directions and of course we can go in any direction that's combination of u1 and u2 the question is which direction should we go in to increase our success of finding a new species let's say our chief biologist super smart guy has devised this formula for us which tells us based on which direction we go in based on the u1 and u2 components of the direction we are choosing to go in it tells us our success of finding a new species right and we want to of course maximize this success because we want to go in the direction that's giving us the most success of finding a new species the formula is given by U transpose s u where s is a 2 by 2 matrix 1 negative 1 3 1 and U is simply the direction we're choosing to go in so it's a 2 by 1 vector with the U 1 and u 2 component of the direction we're choosing to go in now I'm not going to go over all the computations which lead us from the U transpose su to what that formula is algebraically because I think based on the matrix multiplication type operations we've done in past videos you can do that I'm confident but in the end we find that if we find the closed form algebraic Ellucian after doing all these vector matrix multiplications it's equal to u1 plus u2 squared so we want to find the direction that maximizes this seems simple enough maybe we can just do a calculus problem but there's something fishy going on already right let's say we choose to go in this direction right here with this u 1 and this u 2 we plug those into the formula and we get some number now let's say we double those components and we go to u 1 and 2 u 2 now our answer shouldn't change right our success should not change because we're end up going in the same direction we just chose to represent that with an arrow that was double the magnitude but it ends up that if we put 2 u 1 in here and 2 u 2 in here then the two can get factored out it gets squared of course and we end up with 4 times the amount of success that doesn't really make any sense right we don't want that to happen because if we were to do a traditional calculus problem a traditional maximization problem on this it would tell us to set u 1 and u 2 as both as positive infinity because if we do that we're gonna get infinity as our success and that's amazing but that doesn't help us understand which direction we're supposed to go in right so it becomes pretty clear here pretty fast that we need some form of constraint we need to make sure that we're not allowing any possible u 1 and u 2 values that constraint is going to be that we're only going to allow let me make a little bit of room here cancel some of these things out we're only going to allow any u 1 u 2 combination which lies on the unit circle another way of saying that is if this vector U is the one we're choosing to go in we need the magnitude of vector U to be equal to 1 that is the same thing as saying it lies on the unit circle because that means that all we're carrying now is about the direction we're going in we no longer care about the magnitude of that vector because we set the magnitude of that vector in fact we could have set it as any number it just 1 is the most convenient to use in mathematical operations and now you also see why I call these vectors u because I knew that we're going to set the magnitude equal to 1 so I just call them U for a unit vector okay so that is the constraint let's write it a little bit more formally mathematically and then we'll talk about how Lagrange multipliers come in and help us solve our problem okay so our problem after the constraint is taken into account ends up being this guy we want to maximize u transpose su because that's our measure of success based on our super smart scientist friend but now subject to so s T stands for subject to or such that it basically says given the constraint this the constraint being u transpose u is equal to 1 now that looks slightly different than the constraint I gave before before I was saying I want magnitude of U to be equal to 1 but u transpose U is just the magnitude of the vector u squared + 1 squared is just 1 ok so this is really the same constraint you can verify that for yourself as well well I can just give you a quick verification u transpose u since you only has two elements u 1 and u 2 u transpose U is going to be u1 squared plus u2 squared and that's the same thing as the square of the magnitude of U because that's given by the same formula ok cool so that's our constraint now here is how you actually use a Lagrange multiplier you go ahead and write your objective function the function you're trying to maximize or minimize u transpose s u u du plus lambda lambda being the actual Lagrange multiplier itself and then you go ahead and take your constraint move everything onto one side of the equal sign so we get 1 minus u transpose u ok so this is our new function and we basically treat this as an unconstrained problem we have already internalized the constraint we have taken the constraint and put it into this new formula so that now we can go ahead and just operate on this guy like we would a normal calculus problem we just take the derivative and set it equal to zero ok so let's do that let me clear some room on this side of the board okay so if I take the derivative of this guy we know how to take matrix derivatives right if you don't go ahead and watch the video on matrix derivatives that we have that'll help you a lot in understanding what's about to happen if we take the derivative of this guy we saw this exact form in that video so it's going to be - it's gonna be - s you if we take the derivative of this guy the lambda is a constant so we're going to get minus lambda the - coming from here and the derivative of U transpose U is just given by two times you set that equal to zero a lot of simplifications we can make we have a 2 here that can go away a two here that can just go away let's move the lambda u over to that side so we get su is equal to lambda u now this is really cool and let me tell you why what does this look like to you this is of course the equation of a line vector because remember any eigenvector of matrix s is going to satisfy the formula s you you being the eigenvector is equal to some scalar multiple lambda of the vector u so we have found that you any direction we're going in on the plane has to be an eigenvector of matrix s that's already a powerful statement so now we're only able to consider eigen vectors of matrix s of course the question is which i ghen vector of matrix s should be used because it's gonna have potentially a few of them right to figure that out we're gonna do another fancy trick that fancy trick being we're gonna go ahead and left multiply each side of this equation by u transpose so we're gonna get u transpose s u is equal to lambda u transpose u and what is u transpose u well it has to be equal to 1 we know that we set it so that means that we can just get rid of u transpose u my board's obviously not erasing as well as I want it to but that's gone so we find that u transpose se U is equal to lambda why is that significant because the thing we're trying to maximize in the beginning was U transpose su which is the thing right here if we're trying to maximize this and this thing is equal to lambda we're also trying to basically find the maximal lambda but what is lambda from our previous step we saw that lambda is an eigenvalue of matrix s so we basically just want to take all the eigenvalues of matrix s and find the biggest one because if we find the biggest eigenvalue that's going to maximize lambda that's going to maximize this and that's exactly what we want in the first place so it's a little bit of mathematical gymnastics to get from here back to there back to there but hopefully you can see from these calculations we've done that if I were to pick the biggest eigenvalue lambda of matrix s and then find its corresponding eigenvector u that's going to be the direction I need to head in in order to have the highest measure of success based on my constrained maximization problem and that's how you use Lagrange multipliers that's about it they just come in when you have a maximization or minimization problem but you also want to set some kind of constraints so things don't get out of control you basically take that constraint you internalize it into the problem and then you go ahead and you take your derivative as you always would okay and that is about it that's how you that's how you use a Lagrange multiplier okay and we're going to be using this in our principal component analysis video so until next time
Info
Channel: ritvikmath
Views: 22,778
Rating: 4.9576721 out of 5
Keywords:
Id: 6oZT72-nnyI
Channel Id: undefined
Length: 10min 0sec (600 seconds)
Published: Sat Sep 21 2019
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.