Terry Tao: Universality for random matrix ensembles of Wigner type, part 1.1

Video Statistics and Information

Captions
The overarching topic of the second week is universality in random matrices. Universality is the key concept not just in random matrices but in the whole of probability, mathematical physics, and statistical physics, and in random matrices the last decade has seen spectacular developments in proving decades-old conjectures on various aspects of universality, so we will hear from the foremost experts on the topic. Our first lecturer doesn't really need an introduction: he has made fundamental breakthroughs in more domains of mathematics than most graduate students take classes in, and we are fortunate that one of those domains was random matrix theory. Our first lecturer today is Terry Tao from UCLA, and his title is "Some universality techniques for random matrix ensembles."

Thank you. I have changed my title a little bit from what was announced. This will actually be a couple of topics, three topics, all loosely connected to universality, though not as tightly as the original title suggested. The three topics of this lecture series will be the following. First, least singular values of random matrices. Second, the circular law; here I will be working mostly with iid matrices (I'll tell you what an iid matrix is in just a second), and this topic will use material from the first topic as a key input. The third topic, which is not really related, is the Lindeberg exchange strategy, which is one of the key tools we have nowadays to prove universality results, for Wigner matrices in particular but also for iid matrices. It isn't the only tool, and you can't do everything with this method, but you can do quite a lot; usually you combine it with other methods to get the best results. Today and in the first two lectures we will be focusing on least singular values. You all know the singular value decomposition, hopefully: we have an n x p matrix M, say with complex entries,
and then we have a singular value decomposition M = U D V*. You can always factor M as the product of a unitary matrix U, which is n x n, times a diagonal matrix D, or as diagonal as a rectangular matrix can be: D is an n x p matrix with the singular values on the diagonal and lots of zeros elsewhere, and then another unitary matrix V*, which is p x p. So you can always factor an arbitrary matrix into two unitary matrices and an essentially diagonal matrix. The diagonal entries σ1 ≥ σ2 ≥ … ≥ σp ≥ 0 are non-negative, and you can arrange them in decreasing order; these are the singular values of the matrix. They are very important, almost as important as the eigenvalues. For Hermitian matrices the singular values are just the absolute values of the eigenvalues; otherwise the relationship is more complicated. In practice they are all important, but the most important are usually the largest singular value σ1 and the smallest singular value σp. The largest singular value is also the operator norm: another way to think about it is as the supremum of ||Mx|| over all unit vectors x, so the largest singular value is the largest amount by which the matrix can dilate a vector. (Of course, here I am using the l^2 norm for vectors.) Similarly, the smallest singular value is the infimum of ||Mx|| over unit vectors: the most contractive your matrix can be. If the matrix is square and invertible, you can also think of it as the reciprocal of the operator norm of the inverse. So the least singular value measures how invertible your matrix is: it is non-zero exactly when your matrix has full rank, which for a square matrix means it is invertible.
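These facts about the decomposition and the two extreme singular values are easy to check numerically. A small illustrative NumPy sketch (not part of the lecture; the test matrices are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 6, 4
M = rng.standard_normal((n, p)) + 1j * rng.standard_normal((n, p))

# Full SVD M = U D V*, with U (n x n) and V (p x p) unitary and D an
# n x p "rectangular diagonal" matrix carrying the singular values.
U, s, Vh = np.linalg.svd(M)
D = np.zeros((n, p))
D[:p, :p] = np.diag(s)
assert np.allclose(U @ D @ Vh, M)
assert s[0] >= s[-1] >= 0  # decreasing and non-negative

# sigma_1 equals the operator norm, and for a square invertible matrix
# sigma_min is the reciprocal of the operator norm of the inverse.
assert np.isclose(s[0], np.linalg.norm(M, 2))
A = rng.standard_normal((4, 4))
sa = np.linalg.svd(A, compute_uv=False)
assert np.isclose(sa[-1], 1.0 / np.linalg.norm(np.linalg.inv(A), 2))
```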
These numbers show up all the time when you are studying other statistics of random matrices, so it is important to understand them for the various matrix models. There are many matrix models we will care about, but for lectures 1 and 2 we will care mostly about iid random matrices: M = (ξij) for 1 ≤ i ≤ n, 1 ≤ j ≤ p, where the ξij are random complex variables which are independent and identically distributed, so every entry has the same distribution and they are all independent. It is customary to normalise these to have mean 0 and variance 1, and in fact for most of the lectures, for simplicity, we will focus on the case of Bernoulli matrices, which are not only iid but where each entry is just ±1 with a 50/50 chance of each. These are also called random sign matrices: you take an n x p matrix and randomly fill in the entries with plus or minus ones, flipping a different coin for each entry, and you get a random matrix ensemble. This is a very typical example of an iid random matrix, and the most commonly studied such ensemble. At the other extreme, you can also talk about Gaussian random matrices, where the entries are Gaussian random variables. That is a special case which you can often study by other methods: as you may have seen in other lectures already, for a Gaussian random matrix the joint distribution of the singular values or the eigenvalues has very nice explicit formulas, in terms of log gases or determinantal processes, and you can compute things explicitly. But when you leave the Gaussian world and work with more combinatorial models such as Bernoulli matrices, you no longer have nice formulas for the distribution of the singular values.
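Sampling a Bernoulli matrix and checking the normalisation is a one-liner; a NumPy sketch (illustrative, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(1)
n = p = 200

# A Bernoulli / random sign matrix: iid entries, +1 or -1 with a fair
# coin flipped independently for each entry.
M = rng.choice([-1.0, 1.0], size=(n, p))

# Each entry has mean 0 and variance exactly 1, which is the standard
# normalisation for iid ensembles.
assert set(np.unique(M)) == {-1.0, 1.0}
assert abs(M.mean()) < 0.05 and np.allclose((M**2).mean(), 1.0)
```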
For example, the Bernoulli model is a discrete ensemble: there are only finitely many different matrices, so you no longer get a continuous density function. There is no PDF for the joint distribution of the singular values, just some discrete probability distribution which has a rather complicated formula that is almost never usable directly. So we would like to have tools to understand the behaviour of matrices such as Bernoulli matrices. You can consider many, many other random models as well, for example adjacency matrices of Erdős–Rényi random graphs, sparse ensembles, and so forth; these models are almost as general as arbitrary random graphs. I should mention that these are not Hermitian matrices by any stretch: even when n and p are equal, we are not imposing any symmetry on these matrices. They are not symmetric, and that actually complicates the analysis, or rather, it does not make things simpler: for part one it is fine, but for part two the fact that you are non-Hermitian certainly makes things more complicated, once you start to worry about eigenvalues. All right, so the basic questions, which will be the topic of today: given an n x p iid matrix, say an n x p Bernoulli matrix, what is the behaviour of the largest singular value and of the smallest singular value? These questions will turn out to be important for part two, but they are also interesting questions in their own right. Now, the largest singular value is by far the easier one to deal with, and there are many tools to handle it; it is easier because it is an operator norm, and there are many things
you can do. For example, you can try to understand the largest singular value using the moment method: you relate the operator norm to traces. Take, for instance, the trace of M M*, M times its adjoint. This is the sum of the eigenvalues of M M*, which are the squares of the singular values: tr(M M*) = σ1² + … + σp². In particular it bounds σ1² from above. So if you can compute this second-moment-type quantity of your matrix, then you can certainly bound the largest singular value, and more generally you can bound any even power of the largest singular value by an appropriate trace: σ1^{2k} ≤ tr((M M*)^k). We have lots of tools for understanding these sorts of sums; they are fairly explicit polynomial expressions in the matrix entries, and by taking k moderately large you can get quite precise control on the largest singular value by ideas like this. But I will not talk about this here; maybe it will be covered in other lectures. The trouble is that this sort of method does not tell you very much about the least singular value: you would somehow need a negative moment to control the least singular value, and this method turns out not to be so good for that second question. So we are going to focus on a different method, the epsilon-net method, which gives weaker results than the moment method, in the sense that its bounds on the largest singular value are not as strong, but it has the advantage that it can say something about the least singular value, where the moment method really cannot do much. Let us first see the method in the context of the largest singular value. The basic theorem we are going to prove is: for M an n x n Bernoulli matrix, there exist constants C, c > 0 such that P(σ1(M) > C√n) ≤ e^{−cn}; that is, the largest singular value is at most C√n with exponentially high probability.
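The trace bounds σ1^{2k} ≤ tr((M M*)^k) mentioned above, and the fact that they tighten as k grows (the traces are the 2k-th power sums of the singular values, and the corresponding l^{2k} norms decrease in k), can be verified directly. A NumPy sketch (illustrative, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100
M = rng.choice([-1.0, 1.0], size=(n, n))  # random sign matrix
sigma1 = np.linalg.norm(M, 2)             # largest singular value

# tr((M M*)^k) = sum_i sigma_i^{2k} >= sigma1^{2k}, so each trace gives
# an upper bound sigma1 <= tr((M M*)^k)^{1/(2k)}, improving as k grows.
G = M @ M.T
Gk = np.eye(n)
prev = np.inf
for k in range(1, 6):
    Gk = Gk @ G
    bound = np.trace(Gk) ** (1.0 / (2 * k))
    assert sigma1 <= bound <= prev + 1e-8  # valid, and monotone in k
    prev = bound
```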
So outside of a really, really tiny event, of exponentially small probability, the operator norm of a random Bernoulli matrix is basically √n. This is the first result. The trivial bound, by the way: if all the entries were +1, the operator norm would be basically n; the random signs typically buy you a square-root saving on the right-hand side. If you use the moment method you can show that any C bigger than 2 works; the argument I will give does not get C that low, giving something like 4 or 8, so it is a little weaker, but up to a constant it gives comparable results. The way we prove this is that we will not use the moment method; we will just use the supremum definition of the operator norm and compute by brute force. The probability that the operator norm exceeds C√n is the probability that the sup over unit vectors x of ||Mx|| exceeds C√n: if the operator norm is bigger than C√n, then there is some unit vector x such that ||Mx|| is bigger than C√n. So this is the probability of a union: you take a union over all unit vectors, where every unit vector gives an event, namely the chance that this particular unit vector is the bad one, the vector that makes the quantity big; and we want to show that the probability of this union is exponentially small. The most naive thing you could do here, and it is way too naive, is the union bound: since there is a union, you could try to bound the probability by the sum over all unit vectors of the individual probabilities.
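A quick Monte Carlo sanity check of the theorem's statement: for random sign matrices, σ1/√n does concentrate near a constant (around 2, in line with the moment-method constant quoted above). A NumPy sketch (illustrative; the sample sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
ratios = []
for _ in range(5):
    n = 400
    M = rng.choice([-1.0, 1.0], size=(n, n))
    ratios.append(np.linalg.norm(M, 2) / np.sqrt(n))

# The operator norm of an n x n random sign matrix concentrates around
# 2*sqrt(n) for large n, consistent with sigma1 <= C*sqrt(n), C ~ 2-3.
assert all(1.5 < r < 3.0 for r in ratios)
```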
Now, the good news is that each individual probability is indeed very small; these are sums of independent random variables, and we will see later that each individual probability is certainly exponentially small, which it sort of has to be: if you want the whole union to have exponentially small probability, each individual event had better be exponentially small. But the moment you do this you have really lost the game, because there are uncountably many points on the sphere. You are summing over uncountably many points, and no matter how good a bound you have on each individual term, the sum blows up; in fact you cannot even write this down, because probability is not additive over uncountable unions. So the union bound applied directly does not work, and it is way too wasteful: the sup is over the whole unit sphere, so there is an uncountable family of events, but many of them overlap each other very tightly. If two points x and y on the sphere are very close together, then the event that ||Mx|| is big and the event that ||My|| is big should almost coincide, and using the union bound on events that almost coincide is very inefficient. So you improve the argument by replacing the uncountable union with a more discrete union over more separated points, so that you do not reuse essentially the same event over and over again; you reduce the overlap so that the union bound becomes more efficient. (Question from the audience about the role of p.) For the largest singular value the value of p is not very important: for this theorem you can assume without loss of generality that p = n, because an n x p matrix with p ≤ n is a minor of an n x n Bernoulli matrix, and if
you can bound the operator norm of the bigger matrix, then you certainly bound the operator norm of the smaller matrix. So if you can do this for square matrices, you get it for rectangular matrices for free. If you like, you can think only of square matrices; p will become more important when we do the least singular value, but for now you may take p = n. All right. So what we are going to do: we have this unit sphere, with x taking values in the unit sphere of C^p; actually, because it is a Bernoulli matrix, it does not really matter, since you can split everything into real and imaginary parts and work with real vectors. So we have this big sphere, and what I am going to do is discretise it. We pick some small scale ε, which you can take to be something like 1/10, and we let Σ be a maximal ε-separated set in the sphere, which we will call an ε-net. That is, in the sphere you take a finite set of points Σ which is ε-separated, meaning that all points of Σ are separated from each other by distance at least ε, and maximal means that you cannot add any more: there is no other point on the sphere that you can add to this set without destroying the separation property. Maximality then implies that every point on the sphere lies within ε of some point of the net, since otherwise you could add it. Such nets always exist, just by the greedy algorithm; Zorn's lemma is really overkill here, the greedy algorithm will do. It is a bit hard to write one down explicitly,
but you can certainly show one exists. All right, so what is good about this set? First of all, it is not too big: there is a cardinality bound. One could be more precise, but the cardinality is certainly bounded by a constant multiple of (1/ε) raised to the power n, and in fact |Σ| ≤ (4/ε)^n. So if ε is, say, 1/10, this is something like 40^n. These nets are moderately large, exponentially large even, but they are not uncountably large, which was the problem we had before. This bound is not so great, especially when ε gets small, but it is a bound, and it is very easy: it follows from the volume packing argument. What you do is this: around every point of the net you form a little ball of radius ε/2. Because the net is ε-separated, all points at distance at least ε from each other, the triangle inequality shows that all these balls are disjoint. So you fatten each point of the net into a ball of radius ε/2, and all these balls are disjoint; on the other hand, again by the triangle inequality, all these balls lie inside, say, the ball of radius 2 centred at the origin. That is a bit crude, but it certainly holds, because the centres lie on the unit sphere and the radius ε/2 is at most 1. So, comparing volumes, the number of points is at most the volume of the ball of radius 2 divided by the volume of the ball of radius ε/2, which is (a constant times 2^n) divided by (the same constant times (ε/2)^n), giving (4/ε)^n.
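The greedy construction and the packing bound can both be seen in a small dimension. A NumPy sketch (illustrative, not from the lecture; the "sphere" here is a dense random sample standing in for the true sphere, so the covering check is only over the sampled points):

```python
import numpy as np

rng = np.random.default_rng(4)
d, eps = 3, 0.5

# Dense sample of the unit sphere in R^d (a stand-in for the sphere).
pts = rng.standard_normal((5000, d))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)

# Greedy construction of a maximal eps-separated set: keep a point
# iff it is at distance >= eps from everything kept so far.
net = []
for x in pts:
    if all(np.linalg.norm(x - y) >= eps for y in net):
        net.append(x)
net = np.asarray(net)

# Separation: kept points are pairwise at least eps apart.
pd = np.linalg.norm(net[:, None, :] - net[None, :, :], axis=2)
assert (pd + eps * np.eye(len(net)) >= eps).all()
# Covering (of the sample): every sampled point is within eps of the net.
gaps = np.linalg.norm(pts[:, None, :] - net[None, :, :], axis=2).min(axis=1)
assert gaps.max() < eps
# Volume-packing cardinality bound: |net| <= (4/eps)^d = 512 here.
assert len(net) <= (4 / eps) ** d
```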
So in fact you even get a precise constant: the cardinality is at most (4/ε)^n. This is not the best bound you can get, sphere packing can do better, but that is not the point; it is within a constant of optimal, in the sense that up to the choice of the 4 it is the best possible, and since I am not trying to optimise the constant C, it is good enough. So on the one hand we have some finiteness, a bound on the size of the net. The second fact is that the uncountable sup over the sphere can be bounded by the finite sup over the net: as long as ε is at most one half, the sup over the sphere of ||Mx|| is at most twice the sup over y in Σ of ||My||. So we had an uncountable sup, which was causing us trouble in our first attempt at the theorem, and now, at the cost of a factor of 2, which I will not care about because I am not optimising C, I can bound it by a finite sup. Why is this true? First of all, the left-hand side is just the operator norm. The operator norm is a sup, and the sup is attained somewhere: the sphere is compact, so there is some unit vector x with ||M||_op = ||Mx||. Now this unit vector does not necessarily lie in the net; if it did, we would be done. But it lies close to the net: there must be a point y in the net within ε of x. So by the triangle inequality, ||Mx|| ≤ ||My|| + ||M(x − y)||. The first term I can bound by the sup over the net, and the second by the operator norm times ||x − y||, which is at most ε, so ||M(x − y)|| ≤ ε ||M||_op. So as long as ε
is at most one half, this error term can be absorbed into the left-hand side, at the cost of a factor of two: ||M||_op ≤ 2 sup over y in Σ of ||My||. All right. So, going back to the probability we want to bound for the largest singular value: using this fact, and taking ε equal to one half, which is the optimal choice for this purpose, I can bound P(||M||_op > C√n) by P(sup over y in Σ of ||My|| > C√n / 2). So I have bounded the uncountable sup by a finite sup, and now I can apply the union bound without being immediately dead: by the cardinality bound, this is at most |Σ|, an exponential factor like 8^n, times the sup over x in Σ of P(||Mx|| > C√n / 2). We started with the probability of a sup, and we have taken the sup outside the probability using the union bound, but we pay a price, this exponential cardinality factor, which I will call the entropy cost. It is the price you pay because you do not know in advance which x is going to be the worst. If an oracle could tell you the worst x in advance, you could forget about all the other x's and you would lose nothing; but we do not know where the worst x is, we only know it is close to one of the points of the net, and so to freeze x and work with a single event you pay the entropy cost, which is the metric entropy of the set, the number of genuinely distinct points in your parameter space. That is just a price you have to pay, but as long as the bound on the single event is exponentially better than the entropy cost, you can still win. This is now a much simpler
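The key comparison, that the sup over the net controls the operator norm up to a factor of 2 when ε = 1/2, can also be checked numerically. A NumPy sketch (illustrative; as above, the net is built greedily over a dense sample of the sphere, so the factor-2 check is empirical rather than a proof):

```python
import numpy as np

rng = np.random.default_rng(5)
n, eps = 3, 0.5
M = rng.choice([-1.0, 1.0], size=(n, n))  # small random sign matrix
op = np.linalg.norm(M, 2)                 # operator norm = sigma_1

# Greedy eps-net over a dense sample of the unit sphere in R^n.
pts = rng.standard_normal((5000, n))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)
net = []
for x in pts:
    if all(np.linalg.norm(x - y) >= eps for y in net):
        net.append(x)

sup_net = max(np.linalg.norm(M @ y) for y in net)

# The finite sup is trivially at most the operator norm; the absorption
# argument says the operator norm is at most twice the finite sup.
assert sup_net <= op + 1e-9
assert op <= 2 * sup_net
```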
problem: we just need to estimate a single event like this, rather than a union, and this can be handled by various techniques. So, M is an n x p matrix and x is a p x 1 vector. You can think of the matrix as a bunch of rows X_1, …, X_n; each row is just a random element of the discrete cube {−1, +1}^p, since the entries are random signs, and because it is an iid matrix all the rows are independent. So the rows are uniformly distributed on the cube and independent. Then Mx is just an n x 1 vector whose entries are the dot products X_i · x: this vector is a whole bunch of independent random variables, each the dot product of a random sign vector with the fixed vector x. I like to square things, so the probability that ||Mx|| > C√n/2 is the probability that the sum over i of |X_i · x|² exceeds C²n/4; I have just squared both sides. So what I have is the probability that a sum of independent random variables is large: I need a large deviation estimate. What is the probability that this sum is very big? Now, we have no shortage of tools for sums of independent random variables in probability: at this point you can pull out your Chernoff or your Hoeffding inequality or whatever. Since this is covered in the notes, maybe, to save time, let me just do it directly. The way you prove things like the Chernoff inequality, usually, is that you control the probability in terms of exponential moments: you bound it by taking some exponential of
the sum and dividing: P(Σ_i |X_i · x|² ≥ C²n/4) ≤ E exp(t Σ_i |X_i · x|²) / exp(t C²n/4). This is just Markov's inequality applied to exp(t Σ_i |X_i · x|²), and the point is that this is something you can actually compute, because the expectation factors quite nicely: by all the independence, the expectation of the product is the product of the expectations. So this quantity you can compute, and the bound holds for any t between 0 and some threshold; one then optimises in t…
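The Markov step above can be demonstrated empirically. Since e^{t(s−a)} ≥ 1 whenever s ≥ a, the empirical tail frequency can never exceed the empirical exponential-moment bound; a NumPy sketch (illustrative; the parameters t and the threshold a = 2n are arbitrary choices, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(6)
n, p, trials = 20, 20, 5000
x = np.ones(p) / np.sqrt(p)   # any fixed unit vector
t, a = 0.1, 2.0 * n           # exponential-moment parameter, threshold

# S = sum_i (X_i . x)^2 for n independent random sign rows X_i,
# sampled `trials` times.
X = rng.choice([-1.0, 1.0], size=(trials, n, p))
S = ((X @ x) ** 2).sum(axis=1)  # shape (trials,)

tail = (S >= a).mean()                            # Monte Carlo P(S >= a)
chernoff = np.exp(t * S).mean() * np.exp(-t * a)  # E e^{tS} / e^{ta}

# Markov's inequality e^{t(s-a)} >= 1_{s >= a} holds pointwise, so the
# empirical tail never exceeds the empirical exponential-moment bound.
assert tail <= chernoff
```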
Info
Channel: PCMI A Program of the Institute for Advanced Study
Views: 16,133
Rating: 4.9801979 out of 5
Keywords: Institute for Advanced Study, IAS, PCMI, Park City, Park City Mathematics Institute, Random Matrices, math educators, Terry Tao, Nick Cook, universality, random matrix ensembles, Wigner, Erdos, Schlein, Yau, Yin, Vu, asymptotic, universal limiting, universal limiting asymptotics, microscopic scales, Lindeberg, Lindeberg exchange strategy, local relaxation flow method, relaxation flow, Tao, Terence Tao, University of California, Los Angeles, UCLA, New York Times
Id: uqZ1xECT8FE
Length: 33min 25sec (2005 seconds)
Published: Mon Jul 03 2017