Data Mining Lecture 21: Spectral Methods for Graph Partitioning

Captions
All right, so: spectral methods for graph partitioning. Our focus will ultimately still be on social-network-style graphs, but the methods we are going to explain now don't have to be applied only to those kinds of graphs; they are quite general. In fact, in the next class some of the spectral clustering methods we will see don't even have to be applied to graphs — they can be applied to other kinds of data as well.

So let's start by recalling the orthogonality of eigenvectors of symmetric matrices. First, remember the definition: let M be a square matrix; a vector v is an eigenvector of M with eigenvalue λ if M v = λ v. With that definition, we have the following: if M is a symmetric matrix and v1 and v2 are two eigenvectors of M with corresponding eigenvalues λ1 and λ2, where the eigenvalues are different, then the eigenvectors are orthogonal to each other.

We can see a quick proof. We need to show that v1^T v2 is the scalar 0. Write λ1 v1^T v2 = (λ1 v1)^T v2, since λ1 is just a scalar. Now λ1 v1 is just M v1, so (M v1)^T v2 = v1^T M^T v2. Since M^T = M because M is symmetric, this equals v1^T M v2. Next, M v2 = λ2 v2, because v2 is also an eigenvector, so we get v1^T (λ2 v2), and bringing λ2 to the front, λ2 v1^T v2. So v1^T v2 is a scalar which, multiplied by either λ1 or λ2 — two different numbers — produces the same result, and that can only happen when it is 0. So v1^T v2 = 0.
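As a quick sanity check — a small NumPy sketch of my own, not from the lecture slides — build a random symmetric matrix and verify that eigenvectors belonging to different eigenvalues are orthogonal:

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
M = (B + B.T) / 2                      # a random real symmetric matrix

# eigh is for symmetric/Hermitian matrices; eigenvalues come back sorted ascending
eigvals, eigvecs = np.linalg.eigh(M)

v1, v2 = eigvecs[:, 0], eigvecs[:, 1]
print(eigvals[0], eigvals[1])          # two different eigenvalues (almost surely)
print(np.dot(v1, v2))                  # ~0: the eigenvectors are orthogonal
print(np.allclose(eigvecs.T @ eigvecs, np.eye(5)))  # in fact V is orthonormal
```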
So, first point: for a symmetric matrix, two eigenvectors corresponding to two different eigenvalues are orthogonal. Next is the spectral theorem for symmetric matrices: if M is an n x n real symmetric matrix, then M has n orthogonal eigenvectors — and n orthogonal eigenvectors, each of norm 1, form an orthonormal basis — and n real eigenvalues. Now, those n real eigenvalues need not all be different; an eigenvalue can repeat across different eigenvectors, so they are n real eigenvalues counted with multiplicity.

[Student] Can we prove orthogonality when the eigenvalues are the same? We proved that if the eigenvalues are different then the eigenvectors are orthogonal; is there a way to show it when the eigenvalues are equal?
[Instructor] They can be orthogonal, but if the eigenvalues are the same you may actually be dealing with the same eigenvector. Maybe what you want to prove is: the eigenvalues are the same but the vectors are different — and "different" here means lying on different lines, since one can be a scalar multiple or the negative of the other; eigenvectors are only defined up to a constant factor — and then show they can be taken orthogonal. Let's move on; we will see whether that is necessary. Essentially, what the spectral theorem says is that there are n mutually orthogonal eigenvectors, so even if some eigenvalue repeats, the corresponding eigenvectors will still be orthogonal.

Here, if the eigenvalues are λ1 ≤ λ2 ≤ ..., then — in contrast to what we are used to, namely looking at eigenvalues from largest to smallest, or singular values from largest to smallest as in PCA — we will be interested in the eigenvalues from smallest to largest. So when we talk about the first one, we will typically mean the smallest one; we will see the reasons for that. So the point is: for a real symmetric matrix there are n orthogonal eigenvectors and n real eigenvalues.
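To make the student's question concrete, here is a small NumPy sketch (mine, not from the lecture): a symmetric matrix with a deliberately repeated eigenvalue still yields an orthonormal set of eigenvectors.

```python
import numpy as np

rng = np.random.default_rng(4)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))        # a random orthogonal matrix
M = Q @ np.diag([1.0, 1.0, 3.0]) @ Q.T                  # symmetric, eigenvalue 1 repeated twice

eigvals, eigvecs = np.linalg.eigh(M)
print(np.round(eigvals, 6))                             # [1. 1. 3.]: real, counted with multiplicity
print(np.allclose(eigvecs.T @ eigvecs, np.eye(3)))      # True: still an orthonormal basis
```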
A brief proof of the spectral theorem goes like this. By the fundamental theorem of algebra, the eigenvalues are the roots of the characteristic polynomial, so the characteristic polynomial of M has at least one complex root λ, which is an eigenvalue of M with some eigenvector v. Suppose it is complex; we first show that the eigenvalues of a real symmetric matrix cannot actually be complex, i.e., they are real.

Note that λ · conj(v)^T v = conj(v)^T (λ v), and λ v = M v, so this is conj(v)^T M v. Now conj(v)^T M = (M^T conj(v))^T, and M^T = M; moreover, since M is real, M = conj(M). (A real symmetric matrix is a special case of a Hermitian matrix: a complex matrix A is Hermitian if A = conj(A)^T.) So this equals (conj(M) conj(v))^T v = conj(M v)^T v. Now M v = λ v, and conj(λ v) = conj(λ) conj(v) — note that we are not assuming anything is real yet; we are still in the complex space. Taking conj(λ) out, this is conj(λ) · conj(v)^T v. So, just as before, λ · conj(v)^T v = conj(λ) · conj(v)^T v. Since v is a non-zero vector (a norm-one eigenvector), conj(v)^T v is not zero, so λ = conj(λ), i.e., λ is real. And since λ is real and M is real, v cannot have complex coordinates either: from M v = λ v, with M real, λ v has to be real. So the first thing we have shown is that the eigenvalues are real.

The rest of the proof goes as follows: if v1 is a real eigenvector with real eigenvalue λ1, take the orthogonal complement of span(v1), which is another subspace of R^n of one dimension less, apply the same argument there to find an eigenvalue and eigenvector, take out λ2, and so on. That is how one shows there are n orthogonal eigenvectors and n real eigenvalues. This is called the spectral theorem for symmetric matrices; there is a more general version for Hermitian matrices, but we are only interested in real symmetric matrices.

Next is the eigendecomposition — also something everyone knows. If M is an n x n symmetric matrix with eigenvectors v1, ..., vn (now we assume everything is real) and corresponding eigenvalues λ1, ..., λn, then M can be factorized as M = V Λ V^T, where V is the matrix with the eigenvectors stacked as columns and Λ is the diagonal matrix with the eigenvalues as the diagonal entries. Since V is an orthogonal matrix, V^T = V^(-1), so sometimes we use V Λ V^T and sometimes V Λ V^(-1). This is the eigendecomposition of a real symmetric matrix. The proof is easy: stacking all the eigenvectors into a matrix, we can write M V = V Λ — that is just M applied to each eigenvector, giving the eigenvector times its eigenvalue — and since V is an orthogonal matrix (I should say orthogonal, not just orthonormal), V^(-1) = V^T, so M = M V V^(-1) = (M V) V^(-1) = V Λ V^(-1) = V Λ V^T.

Note that the eigendecomposition can also be seen in the following way: for every eigenvector v_i you take v_i v_i^T — not v_i^T v_i; v_i v_i^T is an n x n matrix — scale it by the factor λ_i, and sum over i = 1 to n. This is similar to what we saw for the singular value decomposition: if you simply write out the sum as λ1 v1 v1^T + λ2 v2 v2^T + ..., you get n terms, each an n x n matrix corresponding to one eigenvalue and one eigenvector.
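Here is a small NumPy illustration (again my own, not from the slides) of both views of the eigendecomposition — the factorization M = V Λ V^T and the rank-one sum Σ λ_i v_i v_i^T:

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((4, 4))
M = (B + B.T) / 2                       # real symmetric matrix

lam, V = np.linalg.eigh(M)              # eigenvalues ascending, eigenvectors as columns

# Factorized form: M = V diag(lam) V^T
M_factored = V @ np.diag(lam) @ V.T
print(np.allclose(M, M_factored))       # True

# Rank-one sum form: M = sum_i lam_i * v_i v_i^T
M_sum = sum(lam[i] * np.outer(V[:, i], V[:, i]) for i in range(4))
print(np.allclose(M, M_sum))            # True
```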
Up to this point, these are things most people pretty much know. Now we will start to look at graphs. Suppose we are looking at a graph like this one — a small social-network-style graph where there are communities; you can clearly see that vertices 1 to 5 look like one community and 6, 7, 8 look like another. Just for the convenience of writing things I have named the vertices 1 to 5 in the left part and 6 to 8 in the right part. In a real data set you are not going to have that, but you can always reorder the vertices, and your matrices will be reordered accordingly, so it is not a problem if you name them however is convenient.

The adjacency matrix you all know. Let G be a weighted undirected graph — for simplicity you can drop the weights and take an unweighted undirected graph, everything will still follow — with n vertices, where the weights are given by a function from the edge set E to the positive reals. Why positive? Because if some edge had weight 0 we would simply not include that edge at all; just consider that edge as not being there. So all weights are positive. Then the adjacency matrix of the weighted graph is defined as follows: the (i, j) entry is the weight of the edge (i, j) if that edge exists, and 0 otherwise. For this particular graph, since there are no self-loops, all the diagonal entries are 0. The weights are not written on the figure, but the edges are drawn with different thicknesses: the reference thickness is weight 0.5, this edge from 1 to 2 is, I think, weight 2, the edge from 2 to 5 is, I think, weight 3, and so on. And the matrix is symmetric, because the (1, 2) entry is the same as the (2, 1) entry, and so on. Is everything clear so far? You have also seen the adjacency matrix of an undirected graph before, so this is not new.

Next — ignore the Python code on the slide for a moment, it will be needed shortly — is the degree matrix. The degree matrix D of G is the diagonal matrix whose i-th diagonal entry is simply the sum of the weights in the i-th row (or column, whichever way you take it — the matrix is symmetric).
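The Python code from the slide is not reproduced in these captions, but a minimal sketch along the same lines (with a made-up small weighted graph standing in for the lecture's example, since its exact weights are not fully given) would be:

```python
import numpy as np

# Hypothetical weighted undirected graph on 4 vertices: (i, j, weight)
edges = [(0, 1, 2.0), (0, 2, 1.0), (1, 2, 3.0), (2, 3, 0.5)]

n = 4
A = np.zeros((n, n))
for i, j, w in edges:           # adjacency matrix: A[i, j] = w(i, j), symmetric, zero diagonal
    A[i, j] = w
    A[j, i] = w

D = np.diag(A.sum(axis=1))      # degree matrix: weighted degree of each vertex on the diagonal
print(A)
print(D)
```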
How does that work, then? This is our adjacency matrix, and the sum for the first row is 3, so the degree of vertex 1 is 3. We are used to the degree being tied to how many edges leave a vertex; here, since we are considering a weighted graph, we say the degree is the sum of the weights of the edges at that vertex. So instead of counting 1 + 1 = 2 edges, we take the weights 2 + 1 = 3. Essentially, given the matrix A, the Python code on the slide gives you the matrix D — very simple.

Now the Laplacian matrix of the graph is defined as L = D − A, the degree matrix minus the adjacency matrix. What does that look like? The degree matrix is diagonal, and the adjacency matrix, since there are no self-loops, interestingly has zeros on the diagonal. (I am sorry — I think a couple of entries on the slide are wrong; let me correct them in a moment.) So when there are no self-loops, the diagonal entries of L are just the degrees, unchanged, and the off-diagonal entries are taken only from the adjacency matrix, with a negative sign. This is the Laplacian matrix, and it is the matrix that will be our main interest in everything that follows, so let us look at it more closely.

One definition of the Laplacian is L(G) = D − A, but let's have another view of it, in a slightly different way. Instead of the whole graph, draw a very simple graph with two vertices, 1 and 2, connected by a single edge of weight 1. The Laplacian of this graph, call it L(G_12), is simply

  [  1  −1
    −1   1 ],

because both vertices have degree 1, and the (1, 2) and (2, 1) entries of the adjacency matrix are 1, so we get −1 there. Now note that for any vector x = (x1, x2), the quadratic form x^T L(G_12) x, if we simply calculate it out — first L(G_12) x = (x1 − x2, x2 − x1), then multiply by x^T — is (x1 − x2)^2. So for any vector x, x^T L(G_12) x is a square, which means it is non-negative. And do you know what such a matrix is called? Positive semi-definite — not positive definite, because the form may be zero, but positive semi-definite, since it is non-negative for every x. So the smallest meaningful Laplacian matrix is positive semi-definite.
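A quick numerical check of that identity (a sketch of my own): for the 2 x 2 Laplacian of a single unit-weight edge, x^T L x should equal (x1 − x2)^2 for any x.

```python
import numpy as np

L12 = np.array([[1.0, -1.0],
                [-1.0, 1.0]])          # Laplacian of a single edge of weight 1

rng = np.random.default_rng(2)
for _ in range(3):
    x = rng.standard_normal(2)
    quad = x @ L12 @ x                 # x^T L x
    print(np.isclose(quad, (x[0] - x[1]) ** 2), quad >= 0)
```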
Now let us extend this to larger Laplacians. Instead of a graph with two vertices and one edge, consider a graph with n vertices but only a single edge of weight 1 between nodes u and v. Then the Laplacian L(G_uv) of this graph has only four non-zero entries: no other vertex has any edge, so every other row and column is entirely 0 (those degrees are 0), and we only have entries for u and v. For example, in our graph, if we keep all the nodes but forget every edge except the one between 3 and 5, then the Laplacian L(G_35) has 1 at positions (3, 3) and (5, 5) and −1 at positions (3, 5) and (5, 3).

Now the most important part. Falling back to our original graph, we can write its Laplacian as a sum over the edges: for every edge, take the weight of that edge times the corresponding single-edge Laplacian, i.e., L(G) = Σ over edges (i, j) of w(i, j) · L(G_ij). You probably see it intuitively, but let's verify one or two entries. Take vertex 1: it has edges to 2 and to 3, with weight 2 to vertex 2 and weight 1 to vertex 3, so the off-diagonal entries in row 1 should be −2 and −1 (sorry, I have made a copy-paste mistake on the slide, I will correct it later). What is L(G_12)? It is simply 1, 1 on the diagonal and −1, −1 off the diagonal, in rows and columns 1 and 2; since w(1, 2) = 2, the weighted term contributes 2, 2, −2, −2. In the full sum, the −2 entries at positions (1, 2) and (2, 1) are not touched by any other edge, because only the edge (1, 2) contributes there; but the diagonal entries are touched by other edges. For the edge (1, 3), with weight 1, you get a matrix with 1, 1, −1, −1 in rows and columns 1 and 3, and that 1 together with the previous 2 makes the diagonal entry 3. So you see that if you take, for each edge, a matrix of this form, scale it by the edge weight, and keep overlapping and adding these matrices over all edges, you get back the original Laplacian matrix. Is this clear to everyone? Yes? Fantastic. This formulation will be very important for what we are going to see next.
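Here is a short sketch (again with a made-up weighted graph, not the lecture's exact one) that builds the Laplacian both as D − A and as the weighted sum of single-edge Laplacians, and confirms the two agree:

```python
import numpy as np

edges = [(0, 1, 2.0), (0, 2, 1.0), (1, 2, 3.0), (2, 3, 0.5)]   # hypothetical weighted graph
n = 4

A = np.zeros((n, n))
for i, j, w in edges:
    A[i, j] = A[j, i] = w
L_def = np.diag(A.sum(axis=1)) - A          # L = D - A

# The same Laplacian as a weighted sum of single-edge Laplacians
L_sum = np.zeros((n, n))
for i, j, w in edges:
    L_e = np.zeros((n, n))
    L_e[i, i] = L_e[j, j] = 1.0             # single-edge Laplacian: 1, 1 on the diagonal
    L_e[i, j] = L_e[j, i] = -1.0            # and -1, -1 off the diagonal
    L_sum += w * L_e

print(np.allclose(L_def, L_sum))            # True
```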
Now we can say that the eigenvalues of the Laplacian are non-negative. We already know that for symmetric matrices the eigenvalues are real; what we will now find is that for the Laplacian they are also non-negative. How? First, look at this statement: if L is the Laplacian of a graph G with n nodes, then x^T L x ≥ 0 for every n-dimensional vector x. (Sometimes I write L(G) and sometimes simply L; the context is clear, we are only talking about the Laplacian of G.) The proof: we know that L = Σ over edges (i, j) of w(i, j) · L(G_ij), and for every edge (i, j) we have seen from the two-vertex formulation that x^T L(G_ij) x = (x_i − x_j)^2, where x_i and x_j are the i-th and j-th coordinates of x — in some sense they correspond to the i-th and j-th vertices. That square is non-negative for every edge. Combining the two: x^T L x is a sum over all edges, every element of that sum is non-negative, and each is multiplied by a weight, which is positive. So the whole thing, x^T L x = Σ over edges (i, j) of w(i, j)(x_i − x_j)^2, is non-negative for all x.

From this we get the theorem that the eigenvalues of L are non-negative. Apply the lemma to an eigenvector: let v be an eigenvector and put it in place of x. Then v^T L v ≥ 0, but L v = λ v, so v^T L v = λ v^T v. Now v^T v equals 1 if the eigenvector is normalized — which is what we usually assume — and is a positive quantity in any case, so λ ≥ 0. The eigenvalues of the Laplacian are non-negative.

Next: yes, they are all non-negative, but there is at least one eigenvalue which is actually zero, so 0 is the smallest eigenvalue of the Laplacian. That is pretty easy to see: 0 is an eigenvalue of L, and the corresponding eigenvector is the all-ones vector if we don't normalize — 1 in every coordinate — or the constant vector 1/sqrt(n) in every coordinate if we do. It comes directly from the definition of the Laplacian: the sum of every row of L is 0, because the diagonal entry is the degree, which is the sum of the weights in that row, and the off-diagonal entries are by definition exactly the negatives of those weights. Just verify from this example: 3 − 2 − 1 = 0 for the first row; for the row with diagonal 7 (one entry on the slide should have had a minus sign) we have 7 − 2 − 2 − 3 = 0; and for another row, 6 − 3 − 2 − 1 = 0. So in each row the total contribution of the adjacency part is exactly the negative of the contribution of the degree part. Then multiplying L by the all-ones vector just sums each row — 3 − 2 − 1 = 0 for the first row, and similarly for every row — so we get the zero vector. Since no eigenvalue can be negative, 0 is actually the smallest eigenvalue, with the constant vector (1/sqrt(n) everywhere, if you want to normalize) as its eigenvector.

So now our main point of interest will be: fine, we are happy that 0 is the smallest eigenvalue — what is the second smallest eigenvalue? The second smallest eigenvalue will be our main topic of interest for this graph-partitioning business.
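A quick NumPy check of both facts (my sketch, using the same made-up graph as before): all eigenvalues of L are non-negative, and the all-ones vector is an eigenvector with eigenvalue 0.

```python
import numpy as np

edges = [(0, 1, 2.0), (0, 2, 1.0), (1, 2, 3.0), (2, 3, 0.5)]
n = 4
A = np.zeros((n, n))
for i, j, w in edges:
    A[i, j] = A[j, i] = w
L = np.diag(A.sum(axis=1)) - A

eigvals, eigvecs = np.linalg.eigh(L)       # ascending order
print(eigvals)                             # all >= 0 (up to floating-point error)
print(np.allclose(L @ np.ones(n), 0))      # True: the all-ones vector has eigenvalue 0
print(np.isclose(eigvals[0], 0))           # True: 0 is the smallest eigenvalue
```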
First, let us notice that the second smallest eigenvalue of L is positive if and only if the graph is connected. If the graph is not connected, then we will have at least one more eigenvalue equal to 0; if the graph is connected, then there will be only one eigenvalue 0, i.e., 0 will have multiplicity 1. Let's do the "if and only if" in both directions, starting with the disconnected case.

Let G be disconnected. Then there are at least two components, G1 and G2, with no edge between them. You can reorder the vertices and write the Laplacian as a block matrix: the block for G1 is totally separate from the block for G2 — no edge from one part to the other, because G1 and G2 are completely disconnected. Matching this block division, you can also block-divide vectors: take the vector with 1 in the rows belonging to G1 and 0 in the rows belonging to G2, and the vector with 0 on the G1 rows and 1 on the G2 rows. Each of these is an eigenvector with eigenvalue 0 — it is just a block-multiplication generalization of what we saw on the previous slide. So if G is disconnected, then the second smallest eigenvalue — and by "second smallest" we do not mean strictly larger; we mean λ2 in the ordering λ1 ≤ λ2 ≤ λ3 ≤ ... — is also 0: λ1 = 0 and λ2 = 0, with eigenvectors partitioned in exactly the same way as the blocks of L(G).
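Numerically, the statement can be checked by counting (near-)zero eigenvalues of the Laplacian for a connected and a disconnected toy graph — a sketch of my own, with made-up edge lists:

```python
import numpy as np

def laplacian(n, edges):
    """Build the graph Laplacian L = D - A from a weighted edge list."""
    A = np.zeros((n, n))
    for i, j, w in edges:
        A[i, j] = A[j, i] = w
    return np.diag(A.sum(axis=1)) - A

connected = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 0, 1.0)]   # a 4-cycle
disconnected = [(0, 1, 1.0), (2, 3, 1.0)]                          # two separate edges

for edges in (connected, disconnected):
    eigvals = np.linalg.eigvalsh(laplacian(4, edges))
    lambda2 = eigvals[1]                       # second smallest eigenvalue
    n_zero = np.sum(np.isclose(eigvals, 0))    # multiplicity of eigenvalue 0
    print(lambda2, n_zero)   # connected: lambda2 > 0, one zero; disconnected: lambda2 ~ 0, two zeros
```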
Now let G be connected — this is the interesting direction — and assume we have another eigenvector with eigenvalue 0. Our claim is that if G is connected, then the second smallest eigenvalue is positive, i.e., we cannot have another eigenvalue 0 with a different eigenvector. So assume x is an eigenvector of L(G) with eigenvalue 0; we have to show that x is just our old eigenvector, the constant one — it cannot be a new eigenvector, because a new eigenvector would have to be orthogonal to the old one. Since L x = 0 · x, we also have x^T L x = 0, and we know x^T L x = Σ over edges (i, j) of w(i, j)(x_i − x_j)^2 (you can put the weights w(i, j) there or not, it doesn't matter; the same thing follows). This is a sum of squares; if it is to be 0, every square has to be 0, which means for each edge (i, j) we must have x_i = x_j. Take the first edge — say it is (1, 2) — then x_1 = x_2; then go to the next edge, and so on. Since G is connected, there is a path from every node to every other node, so proceeding this way we get that the first coordinate equals the second, equals the third, equals the fourth, and in some order we cover all of x. So all coordinates of x must be the same: x is the constant vector, i.e., we are simply back to the old eigenvector and not a new one. So we have proved that the second smallest eigenvalue is positive if and only if G is connected. This is the most important thing about today, so I hope you have all got it.

Now, how do we use this to partition graphs? First of all, we only need to deal with connected graphs: if the given graph is already disconnected, we can treat its components as separate connected graphs — the graph is already partitioned, so there is nothing to do — so let's focus on connected graphs. The goal is to partition the vertex set V into two disjoint sets of vertices S and T such that the cut is small. You should know the notion of a cut from one of your first-year courses, but if you have forgotten, no issue: here the cut just means the edges between S and T. But suppose your graph looks like this: one natural community here, another natural community there, three edges between them, and one outlier vertex with a single edge into one of the communities. The whole graph is connected. What would be the minimum cut that partitions it into two non-empty subgraphs? It would be the one that cuts off that single edge — but that's not interesting; there was one poor outlier somewhere and you simply threw him or her out. What you wanted was to divide the graph into the two natural communities. So the minimum cut may not be the one we want. What do we want, then? We add another condition: the two partitions should have approximately equal numbers of nodes — exactly equal may not be possible, or may not even be the right answer, but they should be comparable in size; it should not be that one side is very small and the other very large. There are derivations of exactly what is meant by the normalized cut and related measures; we don't have to go into them — if you want, you can read that chapter of the MMDS book, which I have already referred to, and there are a number of references at the end of this lecture that I'll talk about shortly. The basic idea is that we want a cut with not too many cross-community (cross-partition) edges, but with the two sides comparable in size.
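To make the objective concrete, here is a tiny helper of my own (the example graph is hypothetical, merely imitating the outlier situation described above) that computes the weight of the cut between two vertex sets from a weighted adjacency matrix:

```python
import numpy as np

def cut_weight(A, S, T):
    """Total weight of the edges crossing between vertex sets S and T."""
    return A[np.ix_(list(S), list(T))].sum()

# Hypothetical example: two triangles {1,2,3} and {4,5,6} joined by three unit edges,
# plus an outlier vertex 0 hanging off vertex 1 by a single edge.
edges = [(1, 2, 1.0), (2, 3, 1.0), (1, 3, 1.0),
         (4, 5, 1.0), (5, 6, 1.0), (4, 6, 1.0),
         (1, 4, 1.0), (2, 5, 1.0), (3, 6, 1.0),
         (0, 1, 1.0)]
A = np.zeros((7, 7))
for i, j, w in edges:
    A[i, j] = A[j, i] = w

print(cut_weight(A, {0}, set(range(1, 7))))        # minimum cut: just the outlier edge, weight 1.0
print(cut_weight(A, {0, 1, 2, 3}, {4, 5, 6}))      # the balanced "community" cut, weight 3.0
```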
So what we will do is try to find the second smallest eigenvalue. We showed that the second smallest eigenvalue is positive if and only if the graph is connected, but the way we find it will also give us a partition of the graph. Before that, we should refer to one important theorem, the Courant–Fischer theorem. It is not possible to prove it here — the proof is very long, and the theorem actually applies to general complex Hermitian matrices. The simplified version for our case reads as follows: the second smallest eigenvalue is

  λ2 = min { x^T L x : ||x|| = 1, x ⊥ 1 },

i.e., the minimum of x^T L x over all x of norm 1 — it may be misleading to write the L2 norm here, since we are also talking about graphs, so you can just read the condition as "the sum of the squares of the coordinates equals 1" — that are orthogonal to the previous eigenvector, which lies on the line of the all-ones vector. Let me just tell you what the original Courant–Fischer min-max principle says: it gives a general guideline, a general formula, for finding eigenvectors one by one. Given nothing, you find the first eigenvector; then, given the eigenvectors found so far, you minimize x^T L x subject to x being orthogonal to all the previous eigenvectors. In our case we obviously have only one previous eigenvector, the all-ones vector, so that is the only orthogonality condition, plus the normalization condition — which we need because otherwise you could scale x by any factor. The vector that minimizes this quantity is exactly the eigenvector corresponding to the second smallest eigenvalue. That is the Courant–Fischer principle. The other side is straightforward to see: if x is that eigenvector, then it satisfies the orthogonality condition and x^T L x is equal to the eigenvalue.

Now, what do we achieve if we solve this constrained optimization problem of finding the second eigenvector? How does this quantity intuitively correspond to our graph, and how does it actually partition the graph? We want to minimize x^T L x, which is the same as Σ over edges (i, j) of w(i, j)(x_i − x_j)^2 (I have omitted the weight term in places, but you can put it in; it will not be a problem), and we have two conditions: the sum of the squares of the coordinates is 1, and x is orthogonal to the all-ones vector. What does the second condition mean? When you compute x^T 1, you simply sum the entries of x, and that sum has to be 0, which means x must have a mix of positive and negative entries. Now x is an n-dimensional vector, where n is the number of vertices; if some coordinates of x are positive and some are negative, that gives you a partition of the graph into two parts: one is all the vertices whose entries are positive, the other all the vertices whose entries are negative. Is this part clear — that if you find a vector x where some entries are positive and some are negative, then the corresponding positive nodes can form one partition and the negative nodes another?
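As an informal numerical check of the Courant–Fischer characterization (my own sketch, on a small hypothetical connected graph): the eigenvector of the second smallest eigenvalue returned by eigh should satisfy both constraints, and its quadratic form should be no larger than that of random unit vectors orthogonal to the all-ones vector.

```python
import numpy as np

# Laplacian of a hypothetical small connected graph
edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 0, 1.0), (0, 2, 1.0)]
n = 4
A = np.zeros((n, n))
for i, j, w in edges:
    A[i, j] = A[j, i] = w
L = np.diag(A.sum(axis=1)) - A

eigvals, eigvecs = np.linalg.eigh(L)
lam2, x = eigvals[1], eigvecs[:, 1]         # second smallest eigenvalue and its eigenvector

print(np.isclose(np.linalg.norm(x), 1.0))   # ||x|| = 1
print(np.isclose(x @ np.ones(n), 0.0))      # x is orthogonal to the all-ones vector
print(np.isclose(x @ L @ x, lam2))          # x^T L x equals lambda_2

# Random unit vectors orthogonal to the all-ones vector should never do better than lambda_2
rng = np.random.default_rng(3)
for _ in range(5):
    y = rng.standard_normal(n)
    y -= y.mean()                           # project out the all-ones direction
    y /= np.linalg.norm(y)
    print(y @ L @ y >= lam2 - 1e-9)         # True
```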
That is where we connect our concept of partitioning to this vector x: the eigenvector will directly give us a partition, x will be our eigenvector. And what about the objective itself — what does it mean? It means we want to minimize a sum of many squares. None of them can be negative — they are all squares — so each is either a large positive number, a small positive number, or zero. And we only sum over edges: if (i, j) is not an edge, we do not have that term. Looking at this graph, I do not have a term (x_3 − x_7)^2, because the edge between 3 and 7 is not there; I only have terms like (x_1 − x_3)^2, (x_7 − x_8)^2, and so on, and I want to minimize that sum. So whenever there is an edge, it is good if x_i and x_j are the same or similar: then x_i − x_j is small in magnitude, and its square is small. In other words, when x_i and x_j are both positive — i.e., we are putting both endpoints into the same partition — they should be similar numbers, and likewise when both are negative; which means the number of edges across the partition should be as small as possible. Now obviously you might think: why couldn't we simply put all of one side into one partition and all of the other into the other, so that edges only run between positives or only between negatives? Sometimes that is not possible, because you ultimately also have to keep x orthogonal to the previous eigenvector, meaning the entries must sum to 0 — though in some cases it may work out. So essentially we are doing two things: we force a mix of positive and negative entries, which gives two partitions, and we minimize the cross-partition edges as much as possible.

If we actually do this — find the second smallest eigenvalue of this Laplacian, which we have already built — we will figure out that it is 0.53 (the smallest being 0), and the corresponding eigenvector will be beautiful: the first five entries are negative and the next three are positive. So it actually gives you a nice partitioning — the first five vertices on one side and the last three on the other. Well, it's because it's a nicely drawn graph; we could perturb the graph a bit, or take larger graphs, and then it will not be so nice anymore, but it will still be pretty good.

Now some more intuition and insight. Notice that the entry for node 8 is about 0.25, while the entries for nodes 7 and 6 are essentially the same, somewhere around 0.53–0.56. If you look at the graph, the roles of 7 and 6 are identical — each is connected only to 8 and to the other — so they get the same number. Theirs are the higher positive values, while 8 has a somewhat lower positive value, which means 8 is closer to being in the other partition: it is still in this partition, but closer to zero.
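Putting the pieces together, a minimal spectral-bisection sketch looks like this. The graph below is a stand-in with assumed weights, not the exact example from the lecture (whose weights are only partly stated), so the numbers will differ, but the sign pattern of the second eigenvector should still split the two communities:

```python
import numpy as np

# Hypothetical stand-in graph: community {0..4}, community {5,6,7}, one weak bridge edge (4, 7)
edges = [(0, 1, 2.0), (0, 2, 1.0), (1, 4, 3.0), (2, 3, 1.0), (3, 4, 1.0), (1, 2, 1.0),
         (5, 6, 1.0), (5, 7, 1.0), (6, 7, 1.0),
         (4, 7, 0.5)]
n = 8
A = np.zeros((n, n))
for i, j, w in edges:
    A[i, j] = A[j, i] = w
L = np.diag(A.sum(axis=1)) - A

eigvals, eigvecs = np.linalg.eigh(L)
fiedler = eigvecs[:, 1]                      # eigenvector of the second smallest eigenvalue

part_pos = np.where(fiedler >= 0)[0]         # one partition: non-negative entries
part_neg = np.where(fiedler < 0)[0]          # the other partition: negative entries
print(eigvals[1])                            # the second smallest eigenvalue
print(part_neg, part_pos)                    # expected: {0..4} on one side, {5,6,7} on the other
```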
Similarly, on the negative side, these nodes are somewhat more strongly negative, while this one is almost positive — about −0.11 — so it is again closer to being in the other partition. So you see that not only do you get a positive/negative partitioning, you also get some idea of how close each node is to the other community.

All right. We will obviously look into bigger matrices and play around in the hands-on in the next class, but what we have seen now is an approach to partition a graph into two partitions, and the partitioning is going to be nice, because it divides the graph into communities. But it gives just two partitions — why should two be so special? We need to extend this approach. One option is hierarchical clustering: you cluster G into G1 and G2, then iteratively apply the same approach to G1 and G2 separately, and again to their parts, so you get a hierarchy of clusterings — two partitions, then each of them again divided into two, and so on, until some kind of threshold. That's one kind of approach. The other kind of approach is to partition into more than two clusters directly. I'll just give you the outline of this; we will see it in more detail in the next class. For that, we try to find something called an eigen gap — I will explain it in the next class — and, if such a gap exists, use the first few eigenvectors, counting from the smallest eigenvalue rather than the largest, apply dimensionality reduction in a slightly different way, and then maybe use some k-means-style algorithm in the reduced-dimensional space. This we will do in the next class. So that's pretty much what I wanted to talk about — any immediate questions as of now? Okay, it looks like for most people it is pretty clear, so let me stop the recording.
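As a preview of that outline — only a rough sketch of my own, not the exact procedure from the next class — embedding each vertex using the first k eigenvectors of the Laplacian and then running k-means could look like this (scikit-learn's KMeans is an assumption on my part; any k-means implementation would do):

```python
import numpy as np
from sklearn.cluster import KMeans   # assumed dependency, not mentioned in the lecture

def spectral_clusters(A, k):
    """Cluster the vertices of a weighted graph (adjacency matrix A) into k groups."""
    L = np.diag(A.sum(axis=1)) - A
    eigvals, eigvecs = np.linalg.eigh(L)
    # Inspect the gaps between consecutive small eigenvalues (the "eigen gap") to judge k,
    # then embed each vertex as a row of the first k eigenvectors (smallest eigenvalues first).
    print("smallest eigenvalues:", np.round(eigvals[:k + 2], 3))
    embedding = eigvecs[:, :k]                  # n x k matrix: one k-dimensional point per vertex
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(embedding)
    return labels

# Example usage with the stand-in adjacency matrix A from the previous sketch:
# labels = spectral_clusters(A, k=2)
```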
Info
Channel: Debapriyo Majumdar
Views: 173
Rating: 5 out of 5
Id: Er0DjMZ5kR0
Length: 51min 26sec (3086 seconds)
Published: Tue Nov 03 2020