Pillai: Cramer-Rao Bound for Multi-Parameter Case

Captions
So far we had case one: just one unknown — say a single angle of arrival, in your array problem. Case two: suppose there are two sources and two angles, so two unknowns, but the same data set, and we try to estimate both of them. The question is: how does the introduction of a new unknown affect the error in the first unknown? Does its variance increase, decrease, or stay the same?

Remember the situation: there was only one unknown, and now a second unknown, a third unknown, and so on come into the same data set. You do unbiased estimation for each of them separately, and one measure of quality is the Cramér-Rao bound for each. So situation A is the one-unknown problem; in situation B the first unknown is still the same, but there are other unknowns as well. What happens to the variance, or equivalently to the Cramér-Rao bound, for the first unknown, now that the scene is more complicated?

(A student recalls the Gaussian example: the bound didn't change for $\mu$ — whether $\sigma^2$ is known or not, the bound for $\mu$ is the same.) Right — but that is only one example. In general, common sense says that once you introduce more unknowns, things can't get better; but you can't go to the other extreme either and say they must get worse. The Gaussian example is one where the bound didn't get worse. So we want to look at this situation in general, and for that we will need a little bit of matrix theory.
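That Gaussian example is easy to check numerically. Below is a minimal sketch — not from the lecture — assuming $n$ i.i.d. $N(\mu, \sigma^2)$ samples and using the standard closed-form Fisher information entries for that model:

```python
import numpy as np

# Fisher information matrix for n i.i.d. N(mu, sigma^2) samples with
# theta = (mu, sigma^2). Standard closed-form entries for this model:
#   J_11 = n / sigma^2,  J_22 = n / (2 sigma^4),  J_12 = J_21 = 0.
def gaussian_fisher(n, sigma2):
    return np.array([[n / sigma2, 0.0],
                     [0.0, n / (2 * sigma2**2)]])

n, sigma2 = 100, 4.0
J = gaussian_fisher(n, sigma2)
J_inv = np.linalg.inv(J)

# J is diagonal, so the bound on mu is the same whether or not sigma^2
# is also unknown: (J^{-1})_11 equals 1 / J_11.
print(J_inv[0, 0], 1 / J[0, 0])   # both print sigma2 / n = 0.04
```

The fact that $J$ is diagonal here is exactly why the bound for $\mu$ is unaffected; we will come back to this point at the end.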
So we take the general case: multiple unknown parameters. Same data set $X_1, X_2, \ldots, X_n$ — i.i.d., say, although it doesn't have to be — and you have the joint density function $f(\mathbf{x}; \theta)$ of all the data points, where I write $\theta = (\theta_1, \theta_2, \ldots, \theta_K)^T$ for the parameter vector. Let $T_i(\mathbf{X})$ be an unbiased estimator for $\theta_i$, so $T_1(\mathbf{X}), T_2(\mathbf{X}), \ldots, T_K(\mathbf{X})$ are unbiased estimators for $\theta_1, \theta_2, \ldots, \theta_K$. This is very similar to the problems you work on: a number of data sources are coming at you — in general that number is also unknown, but let's say you know it — and the $\theta_i$ could literally be the angles of arrival of those sources. From the data you come up, one way or another, with unbiased estimators (getting unbiased estimators is itself generally hard, but let's say we have them), and then you have their variances.

So what does the Cramér-Rao bound say in this case — the variance of $T_i$ is greater than or equal to what? Remember, all these unknowns are simultaneously present; all we have access to is the joint density function of the data, which is already a function of all of them. You can eliminate random variables by integration, if that's what you are trying to do, but you cannot eliminate the parameters that way. We will assume the regularity conditions hold with respect to each parameter — in other words, the derivative with respect to $\theta_i$ can be taken inside the integral over the data, so that $E[\partial \log f/\partial \theta_i] = 0$ for each $i$. The question is how the additional parameters affect the bound.

There are $K$ estimators; they are random variables, and because they are built from the same data, it is natural to assume they will be correlated — so you have to look at their covariance matrix. Let me define $T(\mathbf{X}) = (T_1(\mathbf{X}), T_2(\mathbf{X}), \ldots, T_K(\mathbf{X}))^T$, a column vector, with $\theta$ the column vector of the parameters as above. The covariance matrix is

$$R = E\big[(T(\mathbf{X}) - \theta)(T(\mathbf{X}) - \theta)^T\big]$$

— in the real case; otherwise the transpose becomes a conjugate transpose. This is a $K \times K$ matrix whose $(i,j)$ entry is $R_{ij} = E[(T_i(\mathbf{X}) - \theta_i)(T_j(\mathbf{X}) - \theta_j)]$, the cross-covariance between the $i$-th and $j$-th estimators. And what are the diagonal entries? $R_{ii} = E[(T_i(\mathbf{X}) - \theta_i)^2]$ — the variances of the $T_i(\mathbf{X})$. That is what we are interested in: we want a bound on these diagonal entries. As you know, a covariance matrix is always nonnegative definite, because it is the expectation of a vector times its own transpose.

Now, motivated by the Cramér-Rao bound for the single-parameter case, take the logarithm of the joint density of all the data and differentiate it with respect to any one of the parameters $\theta_i$, $i = 1, \ldots, K$. Let me define the vector

$$\mathbf{F} = \left(\frac{\partial \log f}{\partial \theta_1}, \frac{\partial \log f}{\partial \theta_2}, \ldots, \frac{\partial \log f}{\partial \theta_K}\right)^T$$

— I use the letter $F$ to suggest Fisher. This is a random vector, because it is a function of the $X$'s, so we can look at

$$J = E[\mathbf{F}\mathbf{F}^T],$$

which is also nonnegative definite; in fact we will assume it is positive definite. I am going to call this matrix $J$. I hope you see why $J$ is nonnegative definite: take any arbitrary non-random vector $c$ and form $c^* J c = c^* E[\mathbf{F}\mathbf{F}^T] c$; bring $c$ inside the expectation and you can write this as $E[(c^*\mathbf{F})(\mathbf{F}^T c)] = E[|c^*\mathbf{F}|^2]$ — the expectation of a squared quantity, which is always nonnegative. That is the standard procedure for proving nonnegative definiteness.
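As a concrete illustration of the score vector $\mathbf{F}$ and of $J = E[\mathbf{F}\mathbf{F}^T]$, here is a small Monte Carlo sketch for the same i.i.d. Gaussian model (the score expressions in the comments are the standard ones for that model, stated here as an assumption rather than derived):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma2, n, trials = 1.0, 4.0, 50, 20000

# Score vector F for n i.i.d. N(mu, sigma^2) samples:
#   d log f / d mu      = sum(x - mu) / sigma^2
#   d log f / d sigma^2 = -n/(2 sigma^2) + sum((x - mu)^2) / (2 sigma^4)
scores = np.empty((trials, 2))
for t in range(trials):
    x = rng.normal(mu, np.sqrt(sigma2), n)
    scores[t, 0] = np.sum(x - mu) / sigma2
    scores[t, 1] = -n / (2 * sigma2) + np.sum((x - mu)**2) / (2 * sigma2**2)

print(scores.mean(axis=0))         # ~ (0, 0): E[F] = 0 by regularity
print(scores.T @ scores / trials)  # ~ J = diag(n/sigma2, n/(2 sigma2^2))
```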
Strictly speaking, this argument only proves nonnegative definiteness — in general the covariance matrix of a random vector can be singular — but we are going to assume the density function and the parameters are such that $J$ is actually positive definite. Remember, what matters is $J$: that is where all the information is.

So let's look at the entries. The $(i,j)$ entry is

$$J_{ij} = E\left[\frac{\partial \log f}{\partial \theta_i}\,\frac{\partial \log f}{\partial \theta_j}\right],$$

so the matrix looks like $J_{11}, J_{12}, \ldots, J_{1K}$ across the first row, $J_{21}, \ldots$ across the second, and so on. Look at the diagonal: for the $(i,i)$ entry, $i$ and $j$ are the same, so you get $E[(\partial \log f/\partial \theta_i)^2]$ — exactly the quantity we had the first time around, in the single-parameter case; it is a positive quantity because of the square.

Why am I doing this — why do I have $T(\mathbf{X})$ on one side and $J$ on the other? Let us look at this fundamental quantity and expand it:

$$E\left[(T_i(\mathbf{X}) - \theta_i)\,\frac{\partial \log f}{\partial \theta_j}\right] = \int (T_i(\mathbf{x}) - \theta_i)\,\frac{\partial \log f}{\partial \theta_j}\, f(\mathbf{x};\theta)\, d\mathbf{x}$$

— a multiple integral over all the data, $d\mathbf{x}$ standing for $dx_1 \cdots dx_n$. If you want, write $\partial \log f/\partial \theta_j = (1/f)\,\partial f/\partial \theta_j$; then the $f$ cancels, and splitting into two terms gives

$$\int T_i(\mathbf{x})\,\frac{\partial f}{\partial \theta_j}\, d\mathbf{x} \;-\; \theta_i \int \frac{\partial f}{\partial \theta_j}\, d\mathbf{x}.$$

Now use the regularity conditions: the derivative goes outside each integral. The second term is $\theta_i$ times $\partial/\partial\theta_j$ of the area under a density function, which is 1 — so that whole term is 0. The first term is $\partial/\partial\theta_j$ of $E[T_i]$, and, $T_i$ being an unbiased estimator, $E[T_i] = \theta_i$. So you get the partial derivative of $\theta_i$ with respect to $\theta_j$:

$$E\left[(T_i(\mathbf{X}) - \theta_i)\,\frac{\partial \log f}{\partial \theta_j}\right] = \frac{\partial \theta_i}{\partial \theta_j} = \begin{cases} 1, & j = i,\\ 0, & j \neq i.\end{cases}$$

So we have these two matrices, $R$ and $J$, and this identity connecting them. Any questions? Note that $T(\mathbf{X}) - \theta = (T_1 - \theta_1, \ldots, T_K - \theta_K)^T$ is zero mean, since the estimators are unbiased. Now let me define $Z$ by putting one vector below the other:

$$Z = \begin{pmatrix} T(\mathbf{X}) - \theta \\ \mathbf{F} \end{pmatrix}.$$

Remember each block is $K \times 1$, so $Z$ is $2K \times 1$, and $Z$ is again a random vector because $T$ is random and $\mathbf{F}$ is random. Both blocks are zero mean: the expected value of $T(\mathbf{X}) - \theta$ is 0 because these are all unbiased estimators, and the expected value of $\mathbf{F}$ is 0 — we went through this with the regularity conditions: $E[\partial \log f/\partial \theta_i] = \int (1/f)(\partial f/\partial \theta_i)\, f\, d\mathbf{x}$; the $f$ cancels, the derivative goes outside, and you are differentiating the constant 1, which gives 0.
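The identity just derived, $E[(T(\mathbf{X})-\theta)\mathbf{F}^T] = I$, can also be checked by simulation. A sketch under the same Gaussian assumptions, with the sample mean and the unbiased sample variance standing in for $T_1$ and $T_2$:

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma2, n, trials = 1.0, 4.0, 50, 40000

cross = np.zeros((2, 2))
for _ in range(trials):
    x = rng.normal(mu, np.sqrt(sigma2), n)
    # Unbiased estimators: T1 = sample mean, T2 = sample variance (ddof=1).
    T = np.array([x.mean() - mu, x.var(ddof=1) - sigma2])
    # Score vector F, same expressions as before.
    F = np.array([np.sum(x - mu) / sigma2,
                  -n / (2 * sigma2) + np.sum((x - mu)**2) / (2 * sigma2**2)])
    cross += np.outer(T, F)

print(cross / trials)   # ~ 2x2 identity matrix: E[(T - theta) F^T] = I
```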
If you look at the covariance matrix of $Z$, it is simply $E[ZZ^T]$, because the mean of $Z$ is 0; and, being a covariance matrix, it is nonnegative definite. But now let's compute it block by block:

$$E[ZZ^T] = \begin{pmatrix} E[(T-\theta)(T-\theta)^T] & E[(T-\theta)\mathbf{F}^T] \\ E[\mathbf{F}(T-\theta)^T] & E[\mathbf{F}\mathbf{F}^T] \end{pmatrix}.$$

The $(1,1)$ block is $R$ and the $(2,2)$ block is $J$. And we already have the result for the off-diagonal block: its $(i,j)$ entry is $E[(T_i - \theta_i)\,\partial\log f/\partial\theta_j]$, which as we just showed is 1 only when $i = j$ — so the off-diagonal blocks are identity matrices. I hope you see that this is where I wanted to arrive:

$$\operatorname{Cov}(Z) = \begin{pmatrix} R & I \\ I & J \end{pmatrix} \succeq 0.$$

So you have a matrix of the block form $\begin{pmatrix} A & B \\ B^T & D \end{pmatrix}$ with $A = R$, $B = I$, $D = J$. Let me multiply it on the left by $S = \begin{pmatrix} I & -J^{-1} \\ 0 & I \end{pmatrix}$ — and remember, in your case $J$ is symmetric, so $J^{-1}$ is symmetric too. Do it directly: first block-row into first block-column gives $R - J^{-1}$; first block-row into second block-column gives $I - J^{-1}J = 0$; the second block-row just reproduces $(I,\ J)$. Now multiply on the right by the transpose, $S^T = \begin{pmatrix} I & 0 \\ -J^{-1} & I \end{pmatrix}$ — rows become columns — because the middle factor is a covariance matrix and we want to keep the result symmetric. What do you get?

$$S \begin{pmatrix} R & I \\ I & J \end{pmatrix} S^T = \begin{pmatrix} R - J^{-1} & 0 \\ 0 & J \end{pmatrix}.$$

This is nonnegative definite, by a simple result I think you know: if $M$ is nonnegative definite, then $S M S^T$ is nonnegative definite for any matrix $S$. So what is the conclusion? Another standard fact about nonnegative definite matrices is that every principal submatrix must itself be nonnegative definite. So the $(1,1)$ block, $R - J^{-1}$, is nonnegative definite. The block matrix being nonnegative definite we already knew; but through this transformation we have learned something new:

$$\operatorname{Cov}(T) - J^{-1} \succeq 0,$$

where $\operatorname{Cov}(T) = R$ has entries $\operatorname{Cov}(T_i, T_j)$ and $J^{-1}$ is the inverse of the matrix with entries $J_{11}, J_{12}, \ldots$. There is a standard notation here: if $J$ has entries $J_{ij}$ with subscripts, people generally write the entries of $J^{-1}$ with superscripts, $J^{ij}$.
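Numerically, the conclusion $\operatorname{Cov}(T) - J^{-1} \succeq 0$ can be seen for the same Gaussian pair of estimators — a sketch, whose eigenvalues should come out nonnegative up to Monte Carlo noise:

```python
import numpy as np

rng = np.random.default_rng(2)
mu, sigma2, n, trials = 1.0, 4.0, 50, 40000

# Monte Carlo estimate of R = Cov(T) for T = (sample mean, sample variance).
est = np.array([[x.mean(), x.var(ddof=1)]
                for x in rng.normal(mu, np.sqrt(sigma2), (trials, n))])
R = np.cov(est.T)

J = np.array([[n / sigma2, 0.0], [0.0, n / (2 * sigma2**2)]])
gap = R - np.linalg.inv(J)

# R - J^{-1} should be nonnegative definite; the smallest eigenvalue sits
# near 0 here (the sample mean attains its bound), up to simulation noise.
print(np.linalg.eigvalsh(gap))
```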
So look at the diagonal entries: a diagonal entry of a nonnegative definite matrix is nonnegative, so the variance of the first estimator is greater than or equal to — not $1/J_{11}$, but the $(1,1)$ entry of the inverse: you have to take the whole $J$ matrix, invert it, and then take the first entry. That is the Cramér-Rao bound when you have multiple parameters:

$$\operatorname{Var}(T_i) \ge (J^{-1})_{ii} = J^{ii}.$$

So the answer to the question we started with is here — for instance, $\operatorname{Var}(T_2) \ge J^{22}$. You see the procedure: you form the Fisher information matrix $J$, whose $(i,j)$ entry is $J_{ij} = E[(\partial \log f/\partial\theta_i)(\partial \log f/\partial\theta_j)]$ — we went through this — then you take the inverse, and the diagonal entries of the inverse are the bounds.

Now compare the two situations. Suppose there is effectively one unknown — there are still $K$ parameters, but the rest of them are known. Then $\operatorname{Var}(T_1) \ge 1/J_{11}$, where $J_{11} = E[(\partial \log f/\partial\theta_1)^2]$ — your old result, what we learned two weeks back. In the same problem with more unknowns, the bound says $\operatorname{Var}(T_1) \ge J^{11} = (J^{-1})_{11}$. So you have two different problems, one where the other parameters are known and one where they are unknown. In terms of the bounds, which do you expect to be worse — can you show that the multi-parameter bound is worse than the single-parameter one?

This must reduce to a standard matrix result: we are dealing with a positive definite matrix, and indeed it is true — you should be able to show, if you have the skills — that for any positive definite matrix, the $(i,i)$ entry of the inverse is at least the reciprocal of the $(i,i)$ entry:

$$(J^{-1})_{ii} \ge \frac{1}{J_{ii}}.$$

There are a couple of ways to show this, but let me do the $2 \times 2$ case — that's easy, and with two parameters you can actually quantify the degradation due to the second parameter. In the two-parameter case,

$$J = \begin{pmatrix} J_{11} & J_{12} \\ J_{12} & J_{22} \end{pmatrix},$$

where $J_{11} = E[(\partial \log f/\partial\theta_1)^2]$, $J_{12} = E[(\partial \log f/\partial\theta_1)(\partial \log f/\partial\theta_2)]$, and so on — you can work these out explicitly for the Gaussian case, which is a standard result, but I am writing it for a general problem. And $J$ is positive definite; that's our assumption — otherwise you cannot take the inverse, which we need now. Anybody, what is the inverse? One over the determinant times the adjugate:

$$J^{-1} = \frac{1}{J_{11}J_{22} - J_{12}^2}\begin{pmatrix} J_{22} & -J_{12} \\ -J_{12} & J_{11} \end{pmatrix}.$$

So, for example,

$$(J^{-1})_{11} = \frac{J_{22}}{J_{11}J_{22} - J_{12}^2} = \frac{1}{J_{11}}\cdot\frac{1}{1 - J_{12}^2/(J_{11}J_{22})}$$

— cancel the $J_{22}$ and pull out the $J_{11}$; yes or no? If I call $\varepsilon = J_{12}^2/(J_{11}J_{22})$, we know $\varepsilon < 1$. Why? Not convinced? Because the matrix is positive definite the determinant is positive, which means $J_{11}J_{22} > J_{12}^2$. So $\varepsilon$ is nonnegative and less than 1.
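To see this factor $1/(1-\varepsilon)$ in action, here is a sketch with a hypothetical (made-up) $2\times 2$ Fisher information matrix:

```python
import numpy as np

# A hypothetical positive definite 2x2 Fisher information matrix
# (det = 10*8 - 6^2 = 44 > 0).
J = np.array([[10.0, 6.0],
              [6.0, 8.0]])

eps = J[0, 1]**2 / (J[0, 0] * J[1, 1])    # epsilon = J12^2/(J11 J22) = 0.45
bound_known   = 1 / J[0, 0]               # other parameter known: 0.1
bound_unknown = np.linalg.inv(J)[0, 0]    # other parameter unknown: ~0.182

# The ratio of the two bounds is exactly the degradation factor 1/(1-eps).
print(bound_unknown / bound_known, 1 / (1 - eps))   # both ~1.818
```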
Now, $1 - \varepsilon$ is positive and less than 1, so $1/(1-\varepsilon)$ is greater than 1. So $(J^{-1})_{11}$ is $1/J_{11}$ multiplied by a factor greater than 1, and therefore

$$(J^{-1})_{11} \ge \frac{1}{J_{11}},$$

at least in this $2\times 2$ case. I want you to do this for the general $K$ case too. You can prove it along these lines — if you write out the inverse you have to deal with cofactors and so on — but the easiest way is to do the eigendecomposition of $J$: the entry $J_{ii}$ comes out as a weighted arithmetic mean of the eigenvalues, while the reciprocal of $(J^{-1})_{ii}$ is the corresponding weighted harmonic mean, and then you can use the arithmetic-mean/harmonic-mean inequality to show the result. And if you want to know the size of the degradation, look at this factor $1/(1-\varepsilon)$: it tells you exactly how much the Cramér-Rao bound degrades by bringing in a second unknown.

Finally, what happens if the Fisher information matrix turns out to be diagonal? Anybody? Then $\varepsilon = 0$, and the second parameter is not going to affect the first parameter at all. So in design of experiments, if you have the freedom to choose, you want to design things so that the Fisher information matrix turns out to be diagonal: if your design parameter enters the off-diagonal term, you can adjust it so that that term is zero.
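A last sketch of that design-of-experiments point: with a diagonal information matrix there is no degradation at all, while coupling the parameters inflates every diagonal entry of the inverse (both matrices are made up for illustration):

```python
import numpy as np

# Diagonal Fisher information: inverting it just inverts each diagonal
# entry, so each bound equals its single-unknown value 1/J_ii.
J_diag = np.diag([10.0, 8.0])
print(np.diag(np.linalg.inv(J_diag)))     # [0.1, 0.125] = 1 / diag(J)

# Couple the parameters and both bounds degrade.
J_coupled = np.array([[10.0, 4.0],
                      [4.0, 8.0]])        # det = 64 > 0
print(np.diag(np.linalg.inv(J_coupled)))  # [0.125, 0.15625]
```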
Info
Channel: Probability, Stochastic Processes - Random Videos
Views: 343
Rating: 5 out of 5
Id: m7nwsrjOtCI
Length: 39min 49sec (2389 seconds)
Published: Sat Mar 09 2019