Random Matrices: Theory and Practice - Lecture 1

Captions
That's it, okay. Thanks for your kind words; I'm honoured and humbled by this prestigious invitation to come and give a series of lectures on random matrix theory, the field I entered about twelve years ago and which still surprises and amuses me. I have prepared a couple of handouts, and there are some spare copies here. One is on the history of the subject, with some original papers and some curiosities, and the other is about numerical simulations, so I hope that together they will help us form a well-rounded view of the field from multiple perspectives. In preparing these lectures I assumed that, while most of you may have heard about random matrix theory, probably none of you has received formal training in it or has worked on research problems in the area. If this assumption turns out not to be correct, we can tweak the pitch a bit. In summary: although my aim is to cover pretty advanced material in the end, I will try to start as smoothly and gently as possible.

In essence, random matrix theory, commonly denoted RMT, is the happy marriage between linear algebra and probability theory; we can say it is the sum of the two. The main problem of random matrix theory can be formulated in extremely simple terms. I give you a matrix X, of size n by n, whose entries are random and described by a certain joint probability density, and out of this input we would like to say something about the eigenvalues. That is the general setting: a matrix with random entries, and we want to say something about its eigenvalues. Of course I could spend a lot of time trying to convince you that this setting is very useful and comes up naturally in many different fields, but I have decided to follow another route: experience has taught us that sooner or later everyone comes across a random matrix. So I will let time work for me and just focus on giving you some tools, a no-frills account of how to do calculations with random matrices. I will skip the motivation part, because it is boring and because you will learn for yourselves that random matrices are useful without me telling you.

The first calculation we can do, the simplest and most basic, and at the same time very instructive, has to do with 2 by 2 random matrices; let's call it the spacing distribution. The question is very simple. We have a 2 by 2 real symmetric matrix X = [x1 x3; x3 x2], so the two off-diagonal elements are identical, and all three entries are independent. We take x1 and x2 as Gaussian random variables with mean 0 and variance 1, i.e. drawn from the probability density p(x) = (1/sqrt(2π)) exp(−x^2/2), and we take the off-diagonal element x3 as Gaussian with mean 0 and variance 1/2, i.e. drawn from a PDF of the form (1/sqrt(π)) exp(−x^2). Now you might already start asking: why are we picking the variance of the off-diagonal element equal to one half the variance of the diagonal elements?
There is a quite deep reason behind this choice, and I will try to explain why we do it; but other than this difference, the setting is pretty simple. Now, what is the question? I want to compute the probability density function of the spacing s between the two eigenvalues: if λ2 is the largest eigenvalue and λ1 the smallest, I take the difference s = λ2 − λ1 and I want its probability density function. Clearly s is a random variable, because the eigenvalues are random variables, since the entries of the matrix are random. The two eigenvalues are real because the matrix is real symmetric, and I want the probability density function of their spacing.

Well, if we don't have more sophisticated tools, the only thing we can do is to write the characteristic equation, compute the eigenvalues in terms of the entries, compute the spacing, and then try to work out its probability density function. (Can I erase here? Good.) The characteristic equation of a 2 by 2 matrix has a particularly simple form; it is the second-degree equation

λ^2 − Tr(X) λ + det(X) = 0,

and in this case Tr(X) = x1 + x2 and det(X) = x1 x2 − x3^2. From this you can compute the two eigenvalues,

λ_{1,2} = (1/2) [ x1 + x2 ± sqrt( (x1 + x2)^2 − 4 (x1 x2 − x3^2) ) ],

so the spacing λ2 − λ1 is just the square root term, which simplifies to

s = λ2 − λ1 = sqrt( (x1 − x2)^2 + 4 x3^2 ).

From this point onwards we can forget that this object came from a random matrix problem: what we have is a function of three random variables x1, x2, x3, and we want the probability density function of this object given the joint density of the original variables. How do we compute it? We just apply the basic rules for functions of random variables: the PDF of s is given by the integral over the joint PDF of the three variables with a delta function enforcing the definition of s,

P(s) = ∫ dx1 dx2 dx3 (1/sqrt(2π)) exp(−x1^2/2) (1/sqrt(2π)) exp(−x2^2/2) (1/sqrt(π)) exp(−x3^2) δ( s − sqrt( (x1 − x2)^2 + 4 x3^2 ) ).

If we are able to perform this integration we will have our answer. If you notice the structure, a sum of two squares under the square root, the natural thing to do is to go to polar coordinates in a certain combination of x1, x2 and x3. The right change of variables is

x1 − x2 = r cos θ,   2 x3 = r sin θ,   x1 + x2 = ψ,

which will help us in the calculation. This change of variables has a simple inverse, and now we need to compute its Jacobian and express the integrand in terms of the new variables.
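If you want to double-check this step on a computer, here is a minimal sketch of my own (not part of the handout), assuming MATLAB's Symbolic Math Toolbox is available; it builds the inverse change of variables and evaluates the Jacobian determinant symbolically:

```matlab
% Sketch: symbolic check of the Jacobian of (x1, x2, x3) -> (r, theta, psi).
% Assumes the Symbolic Math Toolbox is available.
syms r th psi real
x1 = (psi + r*cos(th))/2;        % inverse change of variables
x2 = (psi - r*cos(th))/2;
x3 = r*sin(th)/2;
J = jacobian([x1; x2; x3], [r, th, psi]);
simplify(det(J))                 % gives -r/4, so |det J| = r/4
```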
If we do that, the Jacobian matrix is built from the partial derivatives of x1, x2, x3 with respect to r, θ and ψ. You can work it out yourself; I'll just give you the answer:

J = [ cos θ / 2,  −r sin θ / 2,  1/2 ;  −cos θ / 2,  r sin θ / 2,  1/2 ;  sin θ / 2,  r cos θ / 2,  0 ].

You can check it easily by taking the partial derivatives of the inverse change of variables, and then all you have to do is compute the determinant of this object: det J = −r/4. So all the angular dependence drops out, and we will of course have to take the absolute value. This is just simple algebra, derivatives and a 3 by 3 determinant, so I will not spend more time on it.

What we have, then, is

P(s) = (1/(2π sqrt(π))) (1/4) ∫_0^∞ dr ∫_0^{2π} dθ ∫_{−∞}^{+∞} dψ  r  δ(s − r)  exp( −(1/2) [ ((ψ + r cos θ)/2)^2 + ((ψ − r cos θ)/2)^2 + 2 (r^2/4) sin^2 θ ] ).

Here 1/(2π sqrt(π)) collects the three Gaussian normalisations and the 1/4 comes from the Jacobian; r is the radial variable, θ runs from 0 to 2π, and ψ, which remember was the sum x1 + x2, runs from −∞ to +∞. The extra r is the absolute value of the Jacobian (the 1/4 has already been taken care of), and the delta function of s minus the combination sqrt((x1 − x2)^2 + 4 x3^2) becomes δ(s − r), because that combination is just sqrt(r^2) = r; that is where the simplification comes from. In the exponential, the first two terms are x1^2 and x2^2 rewritten in the new variables and the last one is x3^2, with a factor 2 correcting for the 1/2 that was pulled out in front. This is the integral we have to solve, and we are helped a lot by the delta function: we can use it to kill the r integral and exchange every occurrence of r with an occurrence of s.

(To a question: yes, you need to take the absolute value of the determinant of the Jacobian matrix, because the change of measure between two integration regions cannot carry a negative sign; otherwise you would be mapping positive regions into negative ones. The determinant is −r/4, but you take its absolute value.)

So from here we have P(s) = (1/(2π sqrt(π))) (1/4) times the remaining integrals; using the delta function to kill the r integral we get a factor of s, but we need to be careful, because the integration over r runs from 0 to infinity. So when we use the delta function we must also impose that s is positive, which is clear physically, because s is the difference between the largest and the smallest eigenvalue. So we can put a Heaviside theta function θ(s) in front, which enforces exactly that.
Having killed the r integral, we are left with two integrations:

P(s) = (1/(2π sqrt(π))) (1/4) s θ(s) ∫_0^{2π} dθ ∫_{−∞}^{+∞} dψ exp( −(1/2) [ (s^2 cos^2 θ + ψ^2 + 2 s ψ cos θ)/4 + (ψ^2 + s^2 cos^2 θ − 2 s ψ cos θ)/4 + (s^2/2) sin^2 θ ] ),

where I have simply expanded the squares and replaced every occurrence of r with an occurrence of s. Then it is just a matter of algebra: the cross terms ±2 s ψ cos θ cancel, and the two s^2 cos^2 θ terms combine into s^2 cos^2 θ / 2, which together with (s^2/2) sin^2 θ gives s^2/2, because cos^2 θ + sin^2 θ = 1, so the θ dependence drops out. After a bit of algebra we get

P(s) = (1/(2π sqrt(π))) (1/4) s θ(s) exp(−s^2/4) ∫_0^{2π} dθ ∫_{−∞}^{+∞} dψ exp(−ψ^2/4).

Note that the θ dependence can only drop out because we took the variance of the off-diagonal element to be one half the variance of the diagonal elements. Had we taken the variances equal, the θ dependence would not have dropped out, we would still have an angular integral to perform, and it would give rise to a Bessel function. Only because of this factor of 2 does the θ dependence simplify; and this is not the end of the story, because this factor of 2 has much deeper consequences, as we will see later. So the θ integral is trivial, just a factor of 2π, and the ψ integral is also easy, a Gaussian integral giving sqrt(4π). The integral is done, and if you absorb all the constants you get, for s positive,

P(s) = (s/2) exp(−s^2/4).

This is the final result: the probability density function for the spacing of the 2 by 2 Gaussian random matrix. Let's plot it. For s going to 0 this object vanishes linearly, so it rises from zero, and then the exponential decay wins and it goes back to zero at large s. This is the shape of the spacing distribution for the 2 by 2 case.

(To a question:) If you picked the variances of the elements all different from each other, you would not get this simple form, but there would be a way to rescale the final result so that everything falls on top of the same universal curve; that's a bit trickier. You're asking about the mean? The mean value is not that important; the variance is what is crucial. Yes, you can do it; normally you don't, just because it makes the calculation more difficult without adding anything. The choice of the variances is the crucial point.

Now, this curve we have just computed is so important in random matrix theory that it has a name: it is named after one person who contributed a lot to the field of random matrices.
It is called the Wigner surmise. You can find something about it in the first pages of the handout: there is a paper by Wigner on page 2, where he did essentially the same calculation I have just done.

(To a question:) Well, this was just a toy model; it's hard to tell what practical problem you would want to solve with a 2 by 2 Gaussian random matrix. I chose this example because it is very instructive: it tells you a lot about how to perform calculations and about what type of information you get about eigenvalues of larger matrices. Typically, when you cannot do a calculation for an n by n matrix, the best thing you can do is probably to go to the 2 by 2 case, work it out, and then try to infer what would happen for larger matrices. So I wouldn't take this example as a true representative of the problems we encounter every day, but the Wigner surmise tells us a lot about what happens in general; that's why I chose it. Don't read too much into it.

Now, about the name "surmise": this is one of the first examples where the name given to things is not really appropriate, and in random matrix theory this happens all the time. "Surmise" in English means to think or infer without certain or strong evidence (this is just taken from the dictionary), so it is hard to see why it is called a surmise, given that it comes from an exact calculation. It is called the Wigner surmise because the story goes that Wigner was attending a conference on neutron scattering, people asked what the typical spacing of resonances in a neutron scattering experiment would be, and he walked up to the blackboard and wrote down basically this formula: "I think it should be like this." From that moment onwards it has been called the Wigner surmise, even though there is no surmise; it comes from an exact calculation. But you will read the name all the time, so I thought I should point this out.

Now, what is this picture telling us? This is a very instructive point. The really important region is the one near s = 0: for s going to 0 the probability density function goes to 0, which means the probability of finding two eigenvalues very close to each other vanishes. The two eigenvalues don't like to stay too close to each other; one eigenvalue feels the presence of the other. This is quite striking, because we started from a matrix with independent entries: the entries don't talk to each other, but the eigenvalues do, because the second eigenvalue doesn't like to stay too close to the first one. This very important feature survives for larger n as well, and it is called level repulsion: the eigenvalues of random matrices, generally, not just for 2 by 2 and not just for the Gaussian measure, don't like to stay too close to each other. They also don't like to stay too far away from each other, but that is less striking. The most important thing is that they talk to each other, even though the entries are independent.
If you want an analogy, you can think of the eigenvalues as birds perching on an electric wire, or as cars parked along a lane: you don't like to park too close to the car ahead of you, and you don't like to stay too far away either. You might think this analogy is totally crazy, but if you go to page 3 of the handout, people actually measured the Wigner surmise in the positions of birds perching on an electric wire and in the distribution of spacings between parked cars, and they found very good agreement. Don't read too much into it; I've heard a story that the person who did the parked-cars experiment was actually questioned by the police: "What are you doing?" "I'm trying to demonstrate the Wigner surmise." "Yeah, yeah, come with us." Good. So this is a very important feature of random matrices: the eigenvalues repel; they talk to each other even if the entries are independent.

Now, just to complete the picture (can I erase?): I forgot to say that in the other handout, the one on numerical simulations, you can of course test this prediction very easily. I did it in MATLAB: you can easily generate a 2 by 2 matrix, diagonalize it to compute the eigenvalues, and compute a histogram of the spacing, and what you get is exactly the Wigner surmise shape. This is on page 3 of the other handout. You see that the number of events where two eigenvalues are very close to each other is negligible, vanishingly small.
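For concreteness, here is a minimal MATLAB sketch of that numerical check, my own illustration rather than the handout's code, with an arbitrary number of samples:

```matlab
% Sketch: spacing histogram of 2x2 real symmetric Gaussian matrices vs. the Wigner surmise.
N = 1e5;                          % number of sampled matrices (arbitrary choice)
s = zeros(N, 1);
for k = 1:N
    d  = randn(2, 1);             % diagonal entries: mean 0, variance 1
    od = randn / sqrt(2);         % off-diagonal entry: mean 0, variance 1/2
    X  = [d(1) od; od d(2)];
    ev = eig(X);
    s(k) = max(ev) - min(ev);     % spacing between the two (real) eigenvalues
end
histogram(s, 'Normalization', 'pdf'); hold on
t = linspace(0, 6, 200);
plot(t, (t/2) .* exp(-t.^2/4), 'LineWidth', 2)   % Wigner surmise P(s) = (s/2) exp(-s^2/4)
xlabel('s'); legend('sampled spacings', 'Wigner surmise')
```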
Now we should compare this situation with what happens in the case of iid random variables. If you take independent, possibly identically distributed random variables, which is not the case for the eigenvalues (the independence here is of the entries, not of the eigenvalues), what would be the distribution of the spacing? Let's compare the two cases and appreciate the difference. In this situation we can perform a calculation that is quite general: it does not depend on the distribution of the individual random variables. I must say, just to motivate you a bit, that although I searched for a long time, I failed to find a single reference where the calculation I'm about to do is performed in full detail from top to bottom. So I'm giving you a gift: you will not find this derivation anywhere; or probably you will, and I was just not able to find it.

So, for comparison, what is the law for the spacing of iid random variables? Everyone knows what iid means? Okay: iid means independent and identically distributed. The setting is that you have random variables X1, ..., Xn described by a joint density of the form

P(x1, ..., xn) = p(x1) p(x2) ··· p(xn),

so the joint distribution of the n random variables is the product of individual distributions, one for each variable: the variables don't talk to each other and they individually follow the same distribution.

We also define, associated with the probability density function p(x) of a single variable, the cumulative distribution function

F(x) = ∫_{−∞}^{x} p(x') dx',

which is the probability that any one of these random variables takes a value smaller than or equal to x; you obtain it by integrating the probability density up to x. (Can I erase here?) This is the setting, and what I'm after is the following. You have a certain domain on the real axis, for example the whole real axis, and you start throwing darts independently, one here, one there, with a certain probability distribution which is the same for every toss, and after a certain number of tosses you ask what the distribution of the nearest-neighbour spacings between the tosses is. Clearly, if you think about it, suppose the PDF has a peaked shape: then with high probability you will always toss your dart around that region, the region where the PDF is highest, so it is highly likely that you get a situation of clustering, where all the darts end up in the same region. This means the probability of seeing small spacings will be high, because the tosses cluster around the position where the PDF is largest. This situation is markedly different from what we've seen for random matrices, where the two eigenvalues end up spaced apart. This is just the idea; let's try to formalise it.

To formalise it, we define an object P_n(s | X_j = x), a conditional PDF: given that one of the random variables, say X_j, takes the value x, this is the probability density that there is another random variable X_k, with k different from j, at the position x + s, and no other variable in between. In other words, we condition on the fact that one dart has already landed at position x; it is there. Given this fact, what is the probability that another dart has landed at position x + s, and all the others have landed either to the left of x or to the right of x + s, so that the space in between is empty? The claim is that the formula for this object is

P_n(s | X_j = x) = p(x + s) [ 1 + F(x) − F(x + s) ]^{n − 2},

and let's try to understand why it is true. We know there is a random variable sitting at x, so we can forget about that one. We want a spacing of size s, so we want another random variable to have landed at x + s; this happens with probability density p(x + s), by definition. And then we want the space in between to be empty: we want the n − 2 remaining random variables to be located either to the left of x or to the right of x + s.
Now, F(x) is the probability that any one of these variables has landed to the left of x, because it is the cumulative distribution function, and 1 − F(x + s) is the probability that it has landed to the right of x + s. Either option is fine, so each remaining variable avoids the gap with probability F(x) + 1 − F(x + s), and this must hold for all n − 2 remaining variables, so we multiply the probabilities together, hence the power n − 2. This is an exact expression, valid for any PDF and any number of random variables. Note, however, that it is not yet the object I promised we would compute, because it is a conditional PDF: we assumed that one specific random variable sits at x. We now want to lift this condition; we don't want any specific random variable to sit anywhere in particular. Still, this way of proceeding is quite clear, at least to my eyes.

How do we move forward from here? We can compute the conditional probability of having a spacing s given that any of the X's is at position x, not just the j-th one, and we can do this using the law of total probability: it is the sum over all random variables of the probability we've just computed, times the probability that X_j is at x. If we want any of the random variables to sit at x, not just X_j, we need to sum over all possible occurrences of this event for X_1, X_2, ..., X_n. But this calculation is simple, because that probability is just the PDF, p(x), and we are summing identical terms, so the sum just becomes a factor of n:

P_n(s | any variable at x) = n p(x) P_n(s | X_j = x).

Now we are getting closer to the object we want, except that we are still not done, because this is still a conditional probability: we have one dart sitting at x, and we want the histogram of spacings no matter where the initial seed is. So what shall we do to remove the dependence on x? Okay, we are very close, thanks to your input. All we have to do is integrate this conditional probability over x, because the initial point can be anywhere on the support Σ of the original PDF. In summary,

P(s) = n ∫_Σ dx p(x) P_n(s | X_j = x) = n ∫_Σ dx p(x) p(x + s) [ 1 + F(x) − F(x + s) ]^{n − 2},

where p(x) dx is basically the probability that the first dart has landed at position x.
This is the completely general formula for the probability density function of the spacing between two consecutive random variables, irrespective of where the first one landed.

(To a question:) No, I don't think so: here you are saying that one of the X's, X_j, sits at x, but you don't know which one it is; it can be number one, number two, or number n, so you have n of them, not n − 1. And you are putting one variable at x + s, no matter which one it is, and then the remaining n − 2 avoid the gap. You could do it with permutations, and then you would have a sum with binomial coefficients, but you can resum that expression and you get exactly this one; this is just a quicker way. If you expand the binomial you will see it.

I haven't put it in the handout, but you can now put this final expression on a computer. For example, in MATLAB you can generate random variables from a Gaussian or an exponential distribution, compute the histogram of all the nearest-neighbour spacings, and compare it with this expression; for the exponential distribution you can even do the integral analytically, because you just insert the exponential PDF and CDF and perform one extra integration, and you can really see how the histogram and the theory match. This is an exercise that I suggest you do, and if you are interested we can sit down and do it together in MATLAB; it's really a one-liner: for the exponential distribution you sample the random variables, take the diff, and histogram the diff, and the integral can be performed analytically, so you will find perfect agreement with the theory.

So this is the object we were after, except that the expression is not really informative; it is not as evocative as the Wigner surmise. Maybe we can do something better: we can take the large-n limit to find a more striking and universal result, one that does not depend on the PDF of each single toss. First of all, this is a PDF, so it should be normalised: I ask you, as an exercise, to check that this integral is correctly normalised to one. Here Σ is the support of the individual PDF: for example, if the PDF of each single random variable is exponential, the integral runs from 0 to infinity; if it is Gaussian, from minus infinity to plus infinity; if it is uniform on [0, 1], it runs from 0 to 1.
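As a sketch of the MATLAB exercise suggested above (my own illustration, with arbitrary choices of n, the number of trials, and the exponential distribution as the example), one could compare the pooled nearest-neighbour spacings with the exact formula evaluated by numerical integration:

```matlab
% Sketch: nearest-neighbour spacings of n iid Exp(1) variables vs. the exact formula
% P(s) = n * integral over x of p(x+s) * [1 + F(x) - F(x+s)]^(n-2) * p(x).
n = 20; M = 2e4;                          % n variables per trial, M trials (arbitrary choices)
X = sort(-log(rand(n, M)), 1);            % iid Exp(1) samples, sorted within each trial
S = diff(X, 1, 1);                        % the (n-1) nearest-neighbour spacings per trial
histogram(S(:), 'Normalization', 'pdf'); hold on

p = @(x) exp(-x);  F = @(x) 1 - exp(-x);  % PDF and CDF of Exp(1)
Pexact = @(s) n * integral(@(x) p(x + s) .* (1 + F(x) - F(x + s)).^(n - 2) .* p(x), 0, Inf);
t = linspace(0, 2, 100);
plot(t, arrayfun(Pexact, t), 'LineWidth', 2)
xlabel('s'); legend('histogram of spacings', 'exact formula')
```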
The beauty of this derivation is that it is completely universal, in the sense that it is valid for any PDF of your original random variables: you are not assuming a Gaussian or an exponential. In fact, if it has a fault, it is that it is too general; that's why we now want to take the large-n limit, to extract some universal feature. The formula is correct, but it is correct for everything, so we cannot really read off what it looks like if we were to plot it. We are almost there, and then we can take a break. And do this exercise, it is very instructive, to show that the PDF of the spacing is correctly normalised. This is an exact result for any n and any PDF, so it is a very strong and important result; so important that you don't find it anywhere.

In order to extract some information, we make what is called a local change of variables. I'll just write it and then explain what I mean: I want to go from the variable s to a new variable ŝ defined by

s = ŝ / ( n p(x) ),

that is, I want to measure the spacing between two consecutive random variables in units of the local length scale set by n times the PDF at the point x. Why are we doing this? Suppose we toss a lot of darts, millions of them. Then, as you may imagine, the typical spacing between two tosses goes down as n increases. But it also goes down wherever the probability of throwing a dart is high: in a region where the PDF is large you get many more tosses than elsewhere. So it is natural to rescale the spacing between two consecutive tosses so that these two effects, the number of tosses and the local density of tosses, are washed out. The idea is to measure the spacing in units that take out this spurious effect: spacings come out small not because they are intrinsically small, but just because you have too many tosses or too high a local density. We want to remove this ingredient from the problem, and we can, because this is just a change of variables.

A change of variables in probability is performed as usual. Going back to the conditional distribution, the first object we computed, the one conditioned on X_j being at position x, we now evaluate it in terms of the new variable: we take the formula we had before and compute it at s = ŝ/(n p(x)), which gives

p( x + ŝ/(n p(x)) ) · [ 1 + F(x) − F( x + ŝ/(n p(x)) ) ]^{n − 2}.
I have just taken the original formula we derived and evaluated it at the new variable, not s but ŝ/(n p(x)), with the local density still depending on x. Then I look at what happens to this formula for large n, when I throw a lot of darts. I can expand for large n using a Taylor expansion:

F( x + ŝ/(n p(x)) ) = F(x) + ( ŝ/(n p(x)) ) F'(x) + ...,

and you see that we are onto something, because F(x) cancels against the +F(x) in the bracket, and F'(x), the derivative of the cumulative distribution function, is the probability density function, so it cancels against the p(x) in the denominator. What we are left with inside the bracket is

[ 1 − ŝ/n ]^{n − 2},

which for large n goes to the exponential of... minus ŝ, negative because of the minus sign in 1 − ŝ/n. Is everybody happy with this limit, (1 − a/n)^n → exp(−a)? Good. And what is happening to the prefactor? If we retain just the leading order in n, p(x + ŝ/(n p(x))) is just p(x). So for large n the conditional object is simply

p(x) exp(−ŝ).

Now we are basically done, because we need to inject this into the formula for the PDF of the spacing not conditioned on the position of any particle. The probability density of the new variable ŝ is

P̂_n(ŝ) = P_n( s = ŝ/(n p(x)) ) · |ds/dŝ|,

by the law of change of variables for probabilities. Inserting everything: we have n times the integral over x of p(x), which weights the seed position; then a factor 1/(n p(x)) coming from the derivative ds/dŝ; and then an extra p(x) exp(−ŝ) coming from the large-n conditional. The factor n cancels against the 1/n, one p(x) cancels against the 1/p(x), and the integral of the remaining p(x) over x is 1 because it is a normalised PDF. So the final result is that in the scaling limit, n much larger than 1, the PDF of the spacings measured in units of 1/(n p(x)) is just an exponential:

P̂(ŝ) = exp(−ŝ).

If we now compare: one curve is the Wigner surmise, the other is the scaled law for the spacing of iid random variables, and you see there is a massive difference in the region of small spacings. The Wigner surmise tells us that eigenvalues don't like to stay too close to each other; for iid random variables, no matter what the PDF of each individual variable is (this is a completely universal result in the scaling limit), the probability of finding two particles very close to each other is actually the largest. The origin of this is that if your PDF is peaked around a certain point, most of your darts will fall there, so they will cluster; they will not repel.
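A quick way to see this on a computer, again my own hedged sketch with arbitrary sizes, using Gaussian samples, is to rescale each raw spacing by n p(x) and check that the histogram approaches exp(−ŝ) for large n:

```matlab
% Sketch: "unfolded" nearest-neighbour spacings of n iid N(0,1) variables vs. exp(-s).
n = 1000; M = 200;                         % sample size and number of trials (arbitrary)
p = @(x) exp(-x.^2/2) / sqrt(2*pi);        % PDF of a single variable
shat = zeros(n - 1, M);
for k = 1:M
    x = sort(randn(n, 1));
    shat(:, k) = n * p(x(1:end-1)) .* diff(x);   % rescale each spacing by n * p(x)
end
histogram(shat(:), 'Normalization', 'pdf'); hold on
t = linspace(0, 6, 200);
plot(t, exp(-t), 'LineWidth', 2)           % the exponential (Poisson) law of the scaling limit
xlabel('s-hat'); legend('rescaled spacings', 'exp(-s)')
```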
This dichotomy is what is normally called the difference between Wigner–Dyson statistics and Poisson statistics: Wigner–Dyson statistics is a fancy name for the statistics of stuff that repels, and Poisson statistics for the statistics of stuff that does not, stuff that happily clusters together. That, by the way, is another example of a badly chosen name, because Poisson statistics here has nothing to do with the Poisson distribution: it is the spacing distribution of a Poisson process, which is an entirely different thing, and as you can see this derivation has nothing to do with stochastic processes; it is a completely universal, first-principles derivation which does not require any assumption on the specific PDF of the individual random variables.

(Excellent question:) The Wigner surmise in the form I gave you, (s/2) exp(−s^2/4), is only valid for a 2 by 2 Gaussian random matrix; it is strictly valid only in the setting I gave you. The essential features are valid in general, for instance the fact that the eigenvalues repel. If you ask: suppose I take an n by n Gaussian matrix, three by three or four by four, can we compute the spacing distribution? The answer is yes; the formula is exact but not very useful, because it is given in terms of a Fredholm determinant of a certain operator, and it is so complicated that in practice you can only evaluate it numerically. But the essential features are the same, and in fact people claim that the 2 by 2 Wigner surmise is an excellent approximation for the spacing distribution of larger matrices; there is a deviation of a few percent near the top, but essentially it is very good. Strictly speaking, though, it is not true that the Wigner surmise holds for larger matrices.

(To another question:) The point is that you want to separate two effects: the fact that two random variables do or do not want to stay close to each other, and the fact that they are forced to be close just because there are too many of them, or because the local probability density is large. In some sense you want to unfold your set of random variables, stretching them so that the mean level spacing is equal to one across the spectrum: you make the local density uniform, washing out the fact that two particles are close not because they want to be, but just because there are too many of them or because the local density is large. You compute the spacing after you have made the density of your particles uniform. Is that sort of clear? Okay. So, time for a break.

Please, let's start again. Good. We've seen something about eigenvalues of random matrices, then we went one step back to the situation of iid random variables; now we come back to the topic of random matrices.
What I want to do now is to offer you a layman's classification of random matrix models. Again, this is something I believe is very important; it is so important that you don't find it in textbooks, which is the usual thing, so this is my second gift of the day. Why do I call it a layman's classification? Because if you google "classification of random matrix models" you end up in a domain of mathematical physics and it immediately becomes very complicated. I just want to give you a conceptual map to find your way in this field without getting too tangled up in technicalities.

We have n by n matrices characterised by a certain joint probability density of the entries. In this series of lectures I will impose one requirement: X has a real spectrum, so the eigenvalues λ1, ..., λn are real. If this is not true, we enter a second branch of random matrix theory which would take me about the same number of hours to discuss, and I just don't have enough time, so we will discuss only matrices with real spectrum. The entries can be complex, and not only that, they can also be something else, but the eigenvalues must be real numbers. To be even more restrictive, we will take X as belonging to one of three classes.

First, X can be real symmetric: the entries are real numbers and the matrix is symmetric, which ensures that the spectrum is real. To real symmetric matrices we associate the number β = 1, so when I talk about β = 1 ensembles I will mean matrices that are real and symmetric; β is called the Dyson index of the ensemble, and that's just a definition. Second, X can be complex Hermitian, with β = 2. Can someone give me an example of a 2 by 2 complex Hermitian matrix? A Hermitian matrix is a matrix with complex entries which has real entries on the diagonal and complex-conjugate entries in the off-diagonal positions; I'm sure everybody knew this, but you never know. Third, X can have quaternion elements: it can be a quaternion self-dual matrix, characterised by Dyson index β = 4. I personally hate these, so I will not discuss them, but they exist. If you ask why there is no β = 3, you will need to look up Frobenius' theorem on division algebras, which tells you why we have real numbers, complex numbers and quaternions, but we cannot have "ternions": we cannot define a division algebra with three imaginary units. So we will only need to deal with these three cases, β equal to 1, 2 and 4.
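Just to make the first two classes concrete, here is a small MATLAB sketch of my own (an illustration, not from the lecture materials) that builds a real symmetric and a complex Hermitian random matrix and shows that both spectra are real:

```matlab
% Sketch: the beta = 1 and beta = 2 symmetry classes for a small matrix size.
n = 4;                               % arbitrary size
A = randn(n);
X1 = (A + A') / 2;                   % beta = 1: real symmetric
C = randn(n) + 1i * randn(n);
X2 = (C + C') / 2;                   % beta = 2: complex Hermitian (C' is the conjugate transpose)
eig(X1), eig(X2)                     % both spectra are real
```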
Now, if this is the universe we are working in, which does not exhaust all the possibilities, but it is our universe of random matrices with real spectrum, then within it there are two classes of random matrices of paramount importance, and we can draw a diagram with two overlapping regions.

The first class is matrices with independent entries: random matrices whose joint distribution of the entries factorises into a product of individual distributions, one for each entry,

P(x11, ..., xnn) = p11(x11) ··· pnn(xnn),

modulo, of course, the symmetry requirement: for a real symmetric matrix we only need to consider the upper triangle, because the remaining entries are fixed automatically by symmetry. So independent entries: this is a nice class of random matrices characterised by this property.

The second class of important random matrices is a bit less obvious to define, but I will do my best. They are called rotationally invariant models, and they are characterised by the property

P(X) dx11 ··· dxnn = P(X') dx'11 ··· dx'nn   whenever   X' = U X U^{−1}.

This is the definition; let me try to explain what it means. We take one matrix X in the ensemble, we perform a similarity transformation, a rotation, and we obtain another random matrix, and this rotated matrix has the same statistical weight as the previous one: the two matrices occur in the ensemble with the same probability. This is a rather involved property, because it involves a non-local transformation of all the entries: if you imagine writing the entry (i, j) of X' in terms of the entries of X, it will involve all of them. So it is a very complicated transformation property of the joint distribution, and the requirement is that after all these operations the probability distribution of a random matrix and of its rotated version are the same, no matter what the rotation matrix is. It is a pretty strong constraint; I will show you with a few examples how it is put into practice.

You have seen that in the diagram I drew an intersection between these two classes, and you might wonder what lies in the intersection; here the story becomes quite interesting. First of all, remember that X can be real symmetric, complex Hermitian or quaternion self-dual. If X is real symmetric, then U is an orthogonal matrix; if X is complex Hermitian, U is a unitary matrix; and if X is quaternion self-dual, U is a symplectic matrix, which, by the way, I also hate, and I will not waste even three minutes of my life describing what a symplectic matrix is, because it's a total mess and completely boring, but I'm happy to discuss it later.

But there is an interesting point here, this invariance property: you have the matrix X, you rotate it, and the rotated version has the same statistical weight as the first one.
So what does this mean in terms of the eigenvectors of the two matrices? If by rotating the matrix once, twice, three times you don't change the statistical weight, this basically means that the eigenvectors of this type of ensemble are not that important: a rotation by any matrix, in particular one that scrambles the eigenvectors, leaves the statistical weight unchanged, no matter what the rotation is, so there is nothing specific about the particular set of eigenvectors of X. For this class of models the eigenvectors do not play a very interesting role: you can always rotate, forget the original eigenvectors, go to another basis, and the statistical weight stays unchanged.

For the other class of models, independent entries, the eigenvectors are very important. Why? Because the factorisation of the joint probability density is a property that depends on which basis you are in: if you make a rotation in the space of matrices, it will in general no longer be true that the joint distribution of the rotated matrix factorises; there is no reason to believe it should. Here you see the interplay between probability theory and linear algebra: we have a probabilistic property, the joint density factorises, which is intertwined with the algebraic property that you are sitting in this specific eigenbasis; if you rotate, the factorisation is in general lost. This is the beauty of random matrix theory, this interplay between algebraic and probabilistic properties. In the rotationally invariant case, on the other hand, P(X) and P(X') are the same: if you look at the two, you cannot spot the difference between the original matrix and the rotated one.

Now we want to understand what lies in the intersection. I'll give you a hint based on our very first calculation, and then I'll conclude; we can resume later. The hint is the 2 by 2 real symmetric matrix, whose joint probability density was the product of three Gaussians: independent Gaussian entries, with the off-diagonal element having a variance equal to one half of the variance of the diagonal elements. We can rewrite that joint density as

P(X) = (1/(2π sqrt(π))) exp( −(1/2) ( x1^2 + x2^2 + 2 x3^2 ) ),

with, again, this factor of 2 in front of x3^2. Now, can we rewrite the combination x1^2 + x2^2 + 2 x3^2 in a more clever way? Yes: it is the trace of X^2. Just multiply X = [x1 x3; x3 x2] by itself: the diagonal entries of X^2 are x1^2 + x3^2 and x2^2 + x3^2, and if you take the trace, the two x3^2 terms reconstruct exactly the factor of 2 we have here.
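If you want to convince yourself numerically, here is a tiny sketch of my own checking both identities, that the exponent equals tr(X^2) and that tr(X^2) is unchanged by a rotation:

```matlab
% Sketch: the 2x2 Gaussian weight depends only on trace(X^2), which is rotation invariant.
x1 = randn; x2 = randn; x3 = randn / sqrt(2);
X = [x1 x3; x3 x2];
[trace(X^2), x1^2 + x2^2 + 2*x3^2]       % the two numbers agree

[U, ~] = qr(randn(2));                   % a random orthogonal matrix
Xp = U * X * U';                         % rotated matrix
[trace(Xp^2), trace(X^2)]                % agree (up to rounding): same statistical weight
```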
So the joint distribution of the entries of our matrix can be written in terms of the trace of X^2, and now, bingo. We started from a matrix in the independent-entries region, but we have rewritten the joint distribution of the entries in terms of tr(X^2). If we now perform a rotation in the space of matrices, P(X') becomes proportional to exp( −(1/2) tr( (U X U^T)^2 ) ), which is exactly equal to exp( −(1/2) tr(X^2) ): by performing a rotation we have left the original weight unchanged, and this only because of that factor of 2. Again, if that factor of 2 were not there, we could not have written the joint distribution of the entries in terms of traces. So this hint tells you that this type of matrix lies in the intersection: it has independent entries, but it also has the property of rotational invariance. If P(X) is proportional to exp( −(1/2) tr(X^2) ), it lies in the intersection.

Now the question is: are there other models in the intersection? Unfortunately the answer is negative. There is a theorem, a very old theorem due to Porter and Rosenzweig, which you can find reproduced on pages 6 and 7 of the handout and which was published in a basically unknown Finnish journal, proving that the only ensemble with both independent entries and rotational invariance is the Gaussian ensemble, an ensemble of this form, for any n. This is bad news: it means you have to make a choice. If you want independent entries, you will typically lose rotational invariance; if you want rotational invariance, the resulting ensemble will typically not have independent entries. There is only one ensemble in the intersection; pages six and seven of the handout.

Okay, I think it's time to wrap things up, and we'll continue today at 2:30. Thank you.
Info
Channel: ICTP Condensed Matter and Statistical Physics
Views: 38,028
Keywords: ICTP, Abdus Salam International Centre for Theoretical Physics, Statistical Mechanics, Disordered Systems and Neural Networks
Id: Je4bU3g_QGk
Length: 96min 39sec (5799 seconds)
Published: Wed Dec 13 2017