So, good morning. As you now gradually start to see, in this course we treat various aspects of what we call quantitative risk management in a rather fast mode. I still remember standing in one of these rooms, it might have been this room, about 20 years ago, giving a full summer-school course on extreme value theory; now I'm giving, say, a one-hour lecture on extreme value theory. You know that we wrote a book with Klüppelberg and Mikosch in 1997, Modelling Extremal Events for Insurance and Finance, and again, in the spirit of this course, we want to explain to you tools and techniques we think are relevant. Some of you may be more in contact with questions on aggregation and dependence, others may be using tools from, let's say, time series analysis, others may be using aspects of extremes. Extremes, I think, are extremely important in risk management, because risk management is about understanding extremes. The little section in the book which I think is an extremely nice introduction to extreme value theory is for me finance 101 — you know the American system — something every student in econometrics or finance all over the world should know, because too often we have heard statements like "that was a once in the lifetime of the universe event" or "the world is non-Gaussian", and these are all blah-blah statements. This section says: there is a theory out there which allows you to talk about extremes, allows you to understand them. It is not going to make your life easier; on the contrary, it makes your life more difficult, because the main pedagogical purpose of this particular chapter, of this theory, is to warn you when you are asking extreme questions. If you say "I want to measure operational risk" — for those who know it, a one in a thousand year event, a value-at-risk at the 99.9% level over one year — then it is essentially extreme value theory that can tell you: forget about it; on the basis of the data available this is a nonsensical question, although we have written various papers on it. If you give me high-quality data I can do something, but extreme value theory is not there to say "now I've got the magical theory that allows me to answer these questions". And believe me, in the beginning, when we were introducing this field into finance in the 90s, people thought: now we finally have the tools for answering these extreme questions with these very, very high return periods. It's the contrary: extreme value theory helps you if you have sufficient data, but it tells you very carefully, this is the end of the line. That, I think, is the main message I want to give you. So we give a very short introduction to extreme value theory. When I teach my students I often say: I've got two pockets, models in my right pocket and models in my left pocket. In my right pocket are all the models that have to do with sums. So I look at iid random variables — and believe me, this independent, identically distributed condition we can generalize, that's not the issue; as I said yesterday, let's first walk and then we can run. If I have iid random variables and I'm interested in the sum or the average, then we all know that calculating the distribution exactly is very difficult — you can basically only do it exactly for very few examples — but you can approximate it: by the Gaussian if you have finite variance, and if you don't have finite variance you get the so-
called stable distributions. So there is a well-known class of models that approximates statements about sums. This is an incredibly fantastic theorem. A main point I want to stress here is that this approximation of questions about averages — the average behaviour in markets — is governed by the Gaussian distribution, and in continuous time by Brownian motion, logarithmic Brownian motion, however you want to think about it. And this holds, in the case of the central limit theorem, for every distribution function of my data: you only need finite variance. However your data are distributed, whatever your model, if you look at sums properly scaled, this works — for those who know it, that's total magic. So you know this world: we normalize by the mean and standardize by the standard deviation, and the magical result is the Gaussian limit. Can I find something like that for maxima? Because, as I said, I put the normal distribution and the stable distributions in my right pocket, and if I have a question about sums, this is my model — used every day, every millisecond, all over the world, in every field of science. But now I have questions about the largest event, about records. The Olympics are going on now: what's the probability of a record being broken in this or that sports event? Can I use the Gaussian distribution? Well, you shouldn't — you're not talking about average behaviour; the Olympics is about the best behaviour — so you need different models. Rather than looking at the sum, I now look at the maximum. As an example of an extreme it could also be, say, the second-largest or the fifth-largest; but let's just look at the canonical example, the maximum. Again, is there a standard model out there? I cannot just guess it, and if this is the first time you hear the story, I'm sure you cannot guess the solution either, because it's non-trivial. But in my left pocket I need other models. How do I find them? I mimic the central limit theorem: I say, okay, perhaps if I normalize — like with the mean and the standard deviation in the central limit theorem — perhaps I can find some limit, just as in the central limit theorem, and this H, if I can find it, gets put in my left pocket as an asymptotic model for talking about extremes; it is the identical question. Now, here I mention one other object: I will call x_F the upper endpoint of a distribution. Clearly, if I talk about records or extremes of uniform(0,1) random numbers, the upper endpoint is one — the largest possible observation is one; if I talk about the exponential, the upper endpoint is infinity. So x_F is what I call the upper endpoint, finite or infinite, and clearly it plays an important role. And clearly, if I sample and sample and sample and look at the largest observation, it will keep moving to the right and eventually converge — almost surely, for the mathematicians — to the right endpoint. That is like the law of large numbers but for maxima, and it doesn't tell us very much; we really have to find this H, because we are interested in risk measures and return periods. So my interest is in the distribution of the maximum, which of course I can trivially write down because it's an iid sequence: the probability that the maximum is less than x is the nth power of the marginal. That's trivial — but of course I don't know the marginal distribution, so I need an asymptotic theory.
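Just to put the two pockets side by side in symbols — this is only the normalization just described, nothing more:

```latex
% Right pocket: sums, finite variance (central limit theorem)
\frac{S_n - n\mu}{\sigma\sqrt{n}} \;\xrightarrow{\;d\;}\; N(0,1),
\qquad S_n = X_1 + \cdots + X_n .

% Left pocket: maxima -- look for norming constants c_n > 0, d_n with
\frac{M_n - d_n}{c_n} \;\xrightarrow{\;d\;}\; H,
\qquad M_n = \max(X_1,\ldots,X_n),

% where, for iid data, the exact (but unusable, since F is unknown) law of the maximum is
P(M_n \le x) = F(x)^n .
```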
All right, this is just a pedagogical statement: there is good and bad news about extreme value theory. The good news is that the distribution of the maximum itself I can write down — it's the nth power of the marginal. That's the last piece of good news you get from me; all the other news is bad. We can solve the problem, but it's a very difficult world to work in, i.e. asking questions about extremes — which the whole world is increasingly about — is a non-trivial world. So we can solve this: if I have iid data, I look at the maximum, and if I can find normalizing constants, whatever they are, such that I have a limit distribution like in the central limit theorem but now for the maximum, then — using the standard notation F for my data model, a distribution I don't know and may have to estimate — F is in the maximal domain of attraction of this limit distribution. Maximal domain of attraction: it's as if the data, the distribution, are attracted to that model in my left pocket; the H's are in my left pocket. Now what are the magical distributions in my left pocket? Perhaps the Gaussian? Believe me, it is not the Gaussian, although in industry, to this day, people are still trying to convince some of us that we can use the Gaussian model even when talking about extremes. It's wrong, it's totally wrong. So what does this H look like? Well, here it is: I empty my left pocket and you get all the models. There is no other model out there that gives you the normalized limit distribution of iid maxima than these, modulo location and scale — I can always change the location of my model and I can always rescale, so I can replace x by (x − μ)/σ for some μ and σ; I'm not talking about mean and standard deviation here. So forget about location and scale, we always look at the standard model. And how does it look? Pretty explicit — that's still good news: remember that for the Gaussian and the stable distributions for sums you cannot write down the distribution function explicitly; here you get an explicit distribution function, driven by one important parameter, ξ, called the shape parameter. That's the parameter you're after, and it can be anywhere in the real line. The model is written in such a way that if ξ goes to zero you see the famous e-limit and you get the double exponential. Now, you see already that you get a rather strange distribution function; if you have never seen it you might say, well, what is that? It's very different from the Gaussian. And if ξ is negative you have to be careful, because 1 + ξx has to be positive. You might say that's a little comment — no, that's not just a little comment, that's a highly disturbing comment, because now you see that for negative ξ — which is, for instance, often accepted for rainfall data or metallurgical data — the region where the distribution function is non-zero depends on the unknown parameter, and maximum likelihood theory doesn't like that, because it possibly introduces non-regularity. So statistically, as I said, there is quite a lot of what I'd call bad news: we can solve it, but you have to think carefully.
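Written out, the standard form of the limit H is the generalized extreme value family; the ξ → 0 case is exactly the e-limit just mentioned, and the constraint 1 + ξx > 0 is the "disturbing comment" for negative ξ:

```latex
H_{\xi}(x) \;=\;
\begin{cases}
\exp\!\bigl(-(1+\xi x)^{-1/\xi}\bigr), & \xi \neq 0,\; 1+\xi x > 0,\\[4pt]
\exp\!\bigl(-e^{-x}\bigr), & \xi = 0,
\end{cases}
\qquad
H_{\xi,\mu,\sigma}(x) = H_{\xi}\!\Bigl(\tfrac{x-\mu}{\sigma}\Bigr).
```

Here ξ > 0 gives the Fréchet, ξ = 0 the Gumbel and ξ < 0 the Weibull case.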
But let's not lose the main point: this class of distribution functions, modulo location and scale, is the only class of distribution functions that solves this problem. That is really a wonderful result, and it goes back to Fisher and Tippett in the 1920s, so nobody can say we didn't know about the non-Gaussian world. The non-Gaussian world of extremes goes back to the 1920s, to Sir Ronald Fisher, the founding father of modern statistics. So it's really there, and of course they worked on it because it was an issue of real importance at the time: one of the early applications was the strength of fibrous material — I think one of the early papers was on wool. If you take a threaded piece of wool and you stretch it and stretch it and stretch it until it breaks: when does it break? Not when the average fibre breaks — that would be a Gaussian question — it breaks when the last fibre breaks. So the extreme value question was extremely natural, and that is how Fisher and Tippett came to write that early paper; a whole school then followed with versions of this. Okay, the names you have to remember: in my right pocket, for sums, the Gaussian and the stable distributions — everything to do with sums and averages. In my left pocket: the Weibull, the Gumbel and the Fréchet. The Fréchet is for us the most important — that's why it is coloured orange — because that is very often the model we encounter in finance. The Gumbel is the double exponential. The Weibull always has a finite upper endpoint, so if you have data with a clear finite upper endpoint — the beta-type distributions, anything bounded above — that is immediately the Weibull case. Now, it is not obvious which of these three limits corresponds to the exponential, to the lognormal, to the Gaussian as the model for the data. If you take, say, lognormal data — the standard model in finance, lognormal returns, geometric Brownian motion — and you simulate lognormal data (unfortunately we can't do it right now), look at the maxima, normalize, and look at the shape of the limit, then I can tell you that after a while — and you have to simulate a long time, I'll come back to that — you will see the normalized maximum finally settle into a distribution function that is skewed to the right: the Gumbel. So if you are interested in the behaviour of maxima of lognormal distributions, believe me, this is the limit shape: from my left pocket I must pull out the Gumbel distribution, and if you want to talk about extreme events related to the lognormal, that is the limiting model you use. And that's kind of magic, isn't it? The same would be true for the exponential, the same would be true for a gamma. Not the same is true if you take the Pareto; not the same is true if you take, say, the uniform distribution — that's where things get different. Anyhow, the question now is: what kind of data corresponds to what kind of limiting model? And here I come back to the earlier statement: for iid data with finite variance, the Gaussian, the normal, is always the limiting model for sums — always. Here, already, you see there will be differences, and a whole part of the theory is understanding which data go where; we will just do a bit of it. So this is the famous Fisher–Tippett–Gnedenko theorem. In words it simply says: the models I wrote down here are the only possible limits you can find. It is like our Gaussian-and-stable universe, but for maxima: you cannot find any other limiting model.
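A minimal base-R sketch of the kind of simulation described here; to keep the norming constants trivial it uses exponential rather than lognormal blocks (subtracting log n already works for the exponential — the lognormal also ends up at the Gumbel, just with messier constants and much slower convergence):

```r
set.seed(1)
n <- 1000                                           # block size
m <- 5000                                           # number of blocks
M <- apply(matrix(rexp(n * m), nrow = n), 2, max)   # maxima of m blocks of Exp(1) data
Z <- M - log(n)                                     # simple norming for the exponential case
x <- seq(-2, 6, by = 0.01)
plot(ecdf(Z), main = "Normalized block maxima vs. Gumbel")
lines(x, exp(-exp(-x)), col = 2, lwd = 2)           # Gumbel distribution function
```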
It would be nice to stop now and spend a couple of hours on the proof — I cannot give you the full proof, only an idea of it, because it is one of these magical results — but anyhow, not in this course; I'll just say it's a non-trivial piece of work. Okay, so these are called the generalized extreme value distributions — that's a name, never mind. The normalizing constants we come back to later; we can always choose them so that we get the standard form, no location and scale, that's a detail. Now comes a very interesting statement: all commonly encountered continuous distributions are in the maximal domain of attraction of one of these. But it says commonly encountered, and continuous. You can test me — we will not do it now — but if you name your distribution: the Burr — Fréchet; the exponential — Gumbel; the uniform — Weibull; the log-gamma — Fréchet; whatever named distribution you throw at me, I can tell you which limiting model to take out of my pocket to solve extreme value questions. Now suppose somebody says: the Poisson. You didn't mention it, but let's say you would mention the Poisson — that one I cannot treat in the standard theory. That's pretty amazing, isn't it: the most important discrete distribution doesn't fit in my theory. The geometric? No. The negative binomial? No. There is essentially no standard extreme value theory for discrete distributions. I.e. if you are doing an experiment — say a bioassay in biology — and you look at counts, counts of particles in a system or counts in a typical bioassay, and you say "I'm interested in the maximum number of counts", as you might when you do disease modelling, you cannot use the standard theory; you must come to me and I will tell you what you have to look at. Intuitively this is a little bit clear: talking about the maximum, you must define the maximum, and the problem with discrete distributions is that two independent observations may be identical with strictly positive probability, so how do you order the data? So just a general comment: "commonly encountered" and "continuous" — be careful. If somebody comes to you — and in finance it is typically in credit that people use discrete models, Bernoulli-type models, default / no default, or count models — be careful. All right, that's the judgement; in the continuous case everything is fine. Now you can work it out — I will not do these exercises, you can do them yourself, everything is on the slide. Say I take a Pareto distribution — there are many versions of the Pareto — with parameter θ, so the tail is x to the minus θ, θ a parameter; a typical power law. These are the bread-and-butter models in modern finance: we all know that the world out there is power-like, not Gaussian. So the tail is x to the minus θ, and I give you the norming constants — you can ask me how to find them, there is a whole theory for that — and you see these norming constants have nothing to do with mean and standard deviation, they are just different, which is also natural. And you do the calculation: you go back to the definition — does this converge to the limiting form, and if it does, find the limiting form. If you do the calculation — I leave that to you tonight — you will see you get the Fréchet; I already said Pareto goes to Fréchet. Just do this little exercise, it really shows you how the standard models come out. You could do the same for the exponential: if you can guess the normalizing constants for the exponential, you find the Gumbel.
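For tonight's little exercise, here is a sketch of the calculation for the Pareto tail F(x) = 1 − x^{−θ}, x ≥ 1, with norming constants c_n = n^{1/θ} and d_n = 0:

```latex
P\!\left(\frac{M_n}{n^{1/\theta}} \le x\right)
 \;=\; F\!\bigl(n^{1/\theta}x\bigr)^{n}
 \;=\; \Bigl(1 - \frac{x^{-\theta}}{n}\Bigr)^{n}
 \;\longrightarrow\; \exp\!\bigl(-x^{-\theta}\bigr), \qquad x > 0,
```

which is the Fréchet distribution with shape θ (i.e. ξ = 1/θ in the GEV parameterization).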
Okay, this is a slide I just want you to have heard about. First of all, I want you all to remember, when you walk out of this room, one famous mathematician that very few of you know: Jovan Karamata. Who in this audience has heard of Jovan Karamata? One — well, two, three; some of you may have been in my courses. Karamata delivers the mathematical theory for understanding a lot of the structure in the background of extreme value theory, and also of the theory of sums: it is all about slowly varying functions. Forget about the details for this course, it would take me too far, but my friend and colleague at Cornell, Sid Resnick, once wrote a paper with a title along the lines of how a slowly varying function can ruin a risk manager's night's sleep. It's there, and it has to do with rates of convergence. There is a definition here — think of a constant, think of a logarithm — but I will not go into detail. What I do want to mention is this theorem. Boris Vladimirovich Gnedenko — one of the former colleagues of yours at Lomonosov — proved a fundamental theorem, in 1943 I think, where he showed that F — F is my data model, your returns or your losses — is attracted to the Fréchet, ξ positive, one of the left-pocket models, if and only if — and that's what we like as mathematicians, the nicest results are if-and-only-if results — the tail F̄, which is 1 − F, the probability of a loss larger than x, is a power, x to the minus 1/ξ, times this magical Karamata slowly varying function L. Think of an asymptotic constant, think of a logarithm — believe me, I can think of many more complicated examples, but they all vary slowly, much more slowly than any power. The main point is that this is if and only if, and the whole of finance basically lives in this world. The L function is going to ruin your life. I remember a telephone call — we started to introduce extreme value theory to finance at RiskLab in the mid 90s — and one day I get a call from a trading desk in London; I won't name the bank. They said: we are pricing derivatives on extremes, like a Margrabe-type option, or some option whose payoff involves a maximum over several underlyings — so it is a maximum question — and we do simulation, but it converges very, very slowly. Yes — that is the problem. In extreme value theory the convergence from the data model — the normalized maximum — to the limit model can be very slow, and when I say very slow I mean as slow as I can make it by changing the L, and you do not know the L in practice, no way. Let me tell you: for the exponential the convergence is like lightning, you are immediately at the limit; Gaussian extremes go reasonably fast to the limit, the Gumbel; the plain Pareto goes fast to the limit, the one I showed you — believe me, if those were your data, in no time you are at the limit. But if somebody comes to me and says, no, no, I work with a log-gamma — and by the way, the log-gamma is a standard actuarial model for insurance claims, there are famous papers by Norberg and Ramlau-Hansen in the 70s and 80s on glass breakage and windstorm damage — the log-gamma has a power tail, but now the L function is pretty unpleasant, a logarithm. So what? Well, you simulate: it also goes to the Fréchet, but very slowly, you need loads of data. The right tail of the t distribution is a power, and there the L function is asymptotically a constant, so it converges very fast.
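To pin down the Gnedenko statement from a moment ago in symbols, with L the slowly varying Karamata function just described:

```latex
F \in \mathrm{MDA}(H_\xi),\ \xi > 0
\quad\Longleftrightarrow\quad
\bar F(x) = 1 - F(x) = x^{-1/\xi}\,L(x),
\qquad
\lim_{t\to\infty}\frac{L(tx)}{L(t)} = 1 \ \text{ for all } x>0 .
```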
You see, that is a problem. Just to compare: in the central limit theorem the rate of convergence is essentially always the same, the 1 over square root of n — you can look at Berry–Esseen and so on — it is uniformly fast; here it completely depends on the underlying distribution, and asymptotically you do not know it in practice. So this is already one thing you have to take away, and I cannot help it, it is just one of these little boxes you have to open: if people talk about extremes, a warning — it may converge very slowly. Okay, this theorem already brings us to a first important tool. I'll just briefly mention it, and we will see it again later when Marius takes over — and I must definitely stop in time. Let me open a little door so you can peep into the room: there is a statistical technique in there called the block maxima method, and it comes from environmental science. You look at your data x_1 up to x_n — here n is 12, you have 12 observations — and you divide the data into blocks of equal size: a first block of three, then three, three, three. With your financial data you might do yearly blocking, 250 or 260 trading days per block, or quarterly blocking, whatever: you block. And now you see why this comes from environmental science: there the blocking is natural, you have seasons — a yearly blocking, a quarterly blocking, there is a natural blocking in environmental science. I do assume that everything is iid; I might have to do something beforehand — if there is non-stationarity, as you saw yesterday in Alex's ozone-level data, you first deseasonalize and then do extreme value theory on the residuals, which are iid. Now, for each block — let's say a block is a year; here I only have 12 observations, but say it is a year — I look at the largest observation: the largest of my three data points in the first block, the largest in the second block, the third and the fourth. So I reduce my data; remember, I am after a model for extremes. Of course you could say: I've got my data, just look at the single most extreme value — but then you have only one observation, and even if you are very, very optimistic, you should not do statistics based on one observation. That is where the blocking comes in: let's make several versions of the extremes — here I have got four of them. Now, within each block, if I have sufficient data — say 260 in a block, so I take the largest of 260 — that block maximum may already be close to its limiting model. So at the level of the red block maxima I may have, approximately, iid data from one of my left-pocket models, say the Fréchet with main parameter ξ, and then I can do maximum likelihood, because I know the limiting distribution; the normalization I can take care of statistically, don't worry. Now you see the main problem: you want, at the same time, enough data within each block — because you want the distribution of each red maximum to be close to its limiting distribution, that is where Fisher–Tippett enters — and enough red maxima, enough blocks. Why? Because on the red maxima you will do maximum likelihood.
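A minimal sketch of the blocking step in base R; `x` is assumed to be a numeric vector of (approximately iid) losses, and the block size of 260 trading days is just the yearly-blocking example from above:

```r
block_maxima <- function(x, n = 260) {
  m <- floor(length(x) / n)              # number of complete blocks
  x <- x[seq_len(m * n)]                 # drop an incomplete last block
  blocks <- rep(seq_len(m), each = n)    # block labels 1,1,...,2,2,...
  tapply(x, blocks, max)                 # the m "red" block maxima
}

set.seed(42)
M <- block_maxima(rt(10 * 260, df = 3), n = 260)  # ten "years" of simulated heavy-tailed data
```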
And maximum likelihood also bases its results on asymptotic theory — confidence intervals, efficiency and all that — and you still have to hope that everything is regular. Believe me, a little comment: if ξ is greater than minus a half, the world is fine; close the parenthesis. All the data I personally have seen have ξ greater than minus a half. Negative ξ, the Weibull, we typically find for environmental data or data that are clearly constrained at the top; the data I see very often — always, really — in finance and insurance are Fréchet-type, ξ positive, so the likelihood works. But now you see the problem: you want many blocks and, in each block, many data, and you only have, say, a thousand observations — these two requirements fight each other. That is a typical bias-variance discussion and there is no easy solution; we will see that later, perhaps in the examples — I am promising a lot in your direction, Marius, you'd better start working. All right. I hope you see that with this one theorem I can already do things. So when I do this: here is the theorem, I look at the maximum, normalized correctly — suppose I know the norming constants, so I have to subtract and divide on the right-hand side as well — and when I write that down as a statistical approximation, I say: okay, in the limit this is an H with shape ξ, and now I also include location and scale, so I can forget about the explicit normalization; these are two extra parameters. So I have the shape ξ, a location parameter μ, and a scale parameter: an explicit three-parameter distribution — the good news, unlike the Gaussian or the stable case. Now, at the level of the red block maxima, I do the fitting by maximum likelihood and I get estimates out. You can write down the likelihood, because I can write down the density; you can do the whole likelihood theory, it's explicit, so it is straightforward to write down. It is not straightforward to do the optimization, because these distributions are awkward, and if ξ is negative you had better be careful about regularity — but it can be done. So I can write down the likelihood; these issues I have mentioned; and there is no generally accepted strategy for finding the optimal block size. It is a fight between bias and variance, which you find all over statistics: for instance, if you do spectral analysis for time series and want to find periodicities, it is always your window — broad window, narrow window, variance versus bias; interesting statistics is full of bias-variance questions. That is the theory.
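A sketch of the three-parameter maximum likelihood fit via `optim`, with the GEV negative log-likelihood written by hand (ξ ≠ 0 case only; in practice you would use a dedicated EVT package, and the ξ > −1/2 regularity warning from above applies):

```r
gev_nll <- function(par, x) {
  xi <- par[1]; mu <- par[2]; sigma <- par[3]
  if (sigma <= 0 || abs(xi) < 1e-6) return(1e10)   # crude penalty; excludes the Gumbel boundary
  z <- 1 + xi * (x - mu) / sigma
  if (any(z <= 0)) return(1e10)                    # support constraint of the GEV
  length(x) * log(sigma) + (1 + 1/xi) * sum(log(z)) + sum(z^(-1/xi))
}

fit_gev <- function(maxima)
  optim(c(xi = 0.1, mu = mean(maxima), sigma = sd(maxima)), gev_nll, x = maxima)

## fit <- fit_gev(M)   # M = block maxima from the earlier sketch; fit$par = (xi, mu, sigma)
```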
Now, what are the questions you would like to ask me? "Given so many years of data, give me a 100-year return level estimate." Or you come to me and say: "I want to find out the probability that next year's value will be a record." On the Olympics, for instance, there is a very nice paper asking whether a massive record-breaking performance in one of the track-and-field events really fits, to some extent, with the previous data — very carefully formulated, formalizing the question "is this an acceptable record given the previous data?" I will not give examples, although I have them. These are your questions, and this theory lets you answer them. I will go fairly fast through this section now; Marius will come back to it later. Based on this theory you can start answering the following questions — and believe me, if tonight you sit down and read through these slides, or the relevant section in the book, you will see it is fairly straightforward. Let me just concentrate on one: say you are interested in the level r_{n,k} which is expected to be exceeded in one out of every k blocks of size n — each block has size n and I have m blocks. This means, if you think about it, that the maximum of one block is larger than this number you are looking for with probability 1/k. This is like dike heights, like setting the dike height: for the Dutch dikes k was 10,000 — one in 10,000 years, with yearly blocks — and the Dutch constructed their dikes after the 1953 flood disaster, in the Delta project, to a one in ten thousand year standard. That is the kind of question. You bring it into the language of mathematics — here it is — and for this you use your asymptotic theory, because you know how to approximate the distribution of the maximum: you set one minus this function, which you know explicitly, equal to 1/k, you estimate the parameters — put little hats on everything, maximum likelihood — and you have a solution. Go through it at your ease; this is really first-year probability calculus, very easy. And here is the estimator: you invert — which is nice, because I know the distribution H explicitly, unlike in the Gaussian case — so this is your estimator; you give me your k, your estimates of ξ, σ and μ, and that is your return level, and because it is maximum likelihood you can do asymptotic confidence intervals — there is more to that story than I can tell you now. You can also look at the frequency of an event of a given size — that is called the return period problem, as opposed to this return level problem — and again you can do the same; I leave that to you, we may have an example later. Or you can ask for the probability that the next block — the first future block, so with yearly blocking, next year — produces a record. That is a statement about maxima, so you can work it all out: you know the explicit limit distribution, you can invert it if necessary, estimate all the parameters, and you get your answer.
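Inverting the fitted GEV gives the return level explicitly — a sketch of the "one in k blocks" calculation referred to here, for ξ̂ ≠ 0:

```latex
P\bigl(M_n > r_{n,k}\bigr) = \tfrac{1}{k}
\quad\Longrightarrow\quad
\hat r_{n,k}
 = \hat\mu + \frac{\hat\sigma}{\hat\xi}
   \Bigl(\bigl(-\log(1-\tfrac{1}{k})\bigr)^{-\hat\xi} - 1\Bigr),
```

with the Gumbel version, μ̂ − σ̂ log(−log(1 − 1/k)), obtained as the limit ξ → 0.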
Yes, a question — about the Dutch dikes and ξ equal to zero? That is a big, big topic. I think the original model they came up with was a Gumbel model, ξ = 0; the original analysis presupposed ξ = 0, and it has of course been revisited since. Because a ten-thousand-year event — and the Dutch are of course very good at this — is much more than just estimating from iid data: first of all, what is a dike, what exactly are your data, there is the sea surge and all of that. They used the extreme value field because they knew they would not be able to get reasonable answers based on the Gaussian, so they took at least a maxima approximation, and I am pretty sure the original estimate used ξ = 0 — I can look it up; it has now been updated quite a bit, also because of the greenhouse effect — but it was a combination of things, and I think an excellent combination. May I take the point there, and we can discuss it more in the break? But how do you justify ξ equal to zero? I don't know — that's a good point, that's a good point. Now we come to a point I really want you to understand, Michael's question. This is like the statement we have in extreme value theory that from data we often get infinite-mean models. Surely, you will say, everything has a finite mean, surely there is an upper limit — but are you happy to give me the upper limit? Is the sea surge three metres? Five metres? Ten metres, like in the Sendai flood? No — that is the whole point: once you start setting your upper limit, it really influences the estimation a lot, and I personally would not put one in immediately. Alternatively you can say the limit is 10,000 metres, which I am pretty sure is correct, but that is not far from having no upper endpoint at all. I understand the question — can we come back to it, perhaps also in the break? It is a very important question, a very important point, but we should continue, also because we have no microphone there. Let me make one bracket, because I have had this discussion on the upper limit very often. It was a big discussion in operational risk — and let me make it big, because it is a very important question. In operational risk, in the beginning, the same question: surely it is bounded; for a bank X — and I could name the bank — surely the limit is, say, the total balance sheet, or that times ten, or the wealth of the world; there is an upper limit. Yes, but now the question is: what upper limit do you put into your analysis? Say we put an upper limit of 10 billion for bank A — and I know the banks, I did this early on with extreme value theory — it turns out that where you put this upper limit highly determines the estimation procedure. I personally would not do it, unless I really have a god-given upper limit, by construction of a derivative or so, where I know that is the limit. The same thing is true — and it was a big discussion — when you look at the model with ξ greater than 1: then the model has an infinite mean. Now you can say: infinite mean, how can you do insurance with an infinite mean? I can tell you, there is a lot of data out there that statistically points to infinite-mean models, if you just run the statistics — also in operational risk. I am not saying that I believe the best model for operational risk is an infinite-mean model; I am saying that if your data, in a careful statistical analysis, point to an infinite-mean model, then you had better take a step back and say: if this is the portfolio I am looking at, if these are the data I am working with, I should be very careful about automatically pushing a button and saying this is the estimate, this is your value-at-risk — and by the way, in such cases expected shortfall does not even exist. You find the same question in environmental statistics, environmental pollution — take the Exxon Valdez and the later events in the Gulf of Mexico, many of those data — or take nuclear risk; my experts are sitting in the back. For nuclear risks the 1/ξ is typically around 0.7, so ξ is greater than one: infinite mean. You should really think carefully about risk-managing that. And there is Martin Weitzman at Harvard, who wrote an interesting discussion — hotly contested by economists; you know economists are very good at attacking each other — called the dismal theorem. Write it down, look it up.
Look at the discussion; it is all about the questions you mention here. And the main point is: why do we get these situations — environmental risk, catastrophe risk, nuclear risk, many examples — with these very heavy-tailed models, extremely heavy-tailed, infinite mean? At some point you really have to step back, because they do not have the classical solutions, and at some point you would even ask: is it the right way ahead for society to go into developments that create these kinds of risks, risks that may no longer be insurable? You see, it is a very big societal question. The Dutch solved it brilliantly for their dikes. But I must continue — it is a very important question. Okay, I will leave this example mostly to you, but let me give you the story. Alex wrote exactly this example — I still remember it very well, we were still working at RiskLab — as a pedagogical illustration. Take a risk manager sitting in his or her office — it was Alex, so in his office — looking at the screen. By the way, it is Friday the 16th of October 1987, and my god, this has been a very turbulent week: today the S&P dropped by five-point-something percent, the loss over the week was nine-point-something percent, the largest drop of the S&P since 1962. What happened in 1962, what was the world facing? Missiles — Kennedy, the Cuban Missile Crisis. So this was a very hot week: what is going to happen next Monday? I am not going to predict for you exactly what will happen on Monday, but let me see what the data tell me — this example is about that, and it uses the notation I introduced; I will just quickly give you the gist of the story. We make a point estimate, based on our data and block maxima, for next Monday: what comes out is a four-point-something percent estimate for the drop. Fine. What happened on Monday? The number is there somewhere: on Monday we had a drop of over 20 percent, the famous Black Monday. Now you might say: my god, extreme value theory does not work — you said four or five percent and it was twenty percent. Wait, wait, wait: this is statistics, I give you confidence intervals. If you believe in Gaussianity, you say it is four point seven percent plus or minus two percent — you are still far off. If you do extreme value theory with likelihood-based confidence intervals, you find a confidence interval going from perhaps around three to about twenty. That is extreme value theory, and it shows you that extreme value theory opens your eyes: this is a damn difficult question, but to the best of my statistical knowledge — and there is a non-statistical note, because we knew what was happening with portfolio insurance on Black Monday; the week before they were all running for the same exit door — the confidence interval allows us to go up to 20%, and then we are not far off. You see, extreme value theory is able to catch that, while somebody called it a once in the lifetime of the universe event — and I know several CEOs of big American companies who said exactly that. It's wrong: with the right model this may be more like a one in fifty or one in a hundred year event, which is quite different. That is on that slide, and it uses this theory of block maxima. I must move on: threshold exceedances. Rather than the picture I showed before, I can look at this picture — the same data, but I don't use blocks. This is statistically the more relevant method: now I look at a high level, still with iid data.
I'm sorry, I am sort of fast-forwarding in your slides, but I will fast-backward later. So these are the same data, but I do not look at the block maxima; instead I move up in the data and look at the excesses, the exceedances over a high threshold. I call exceedances the points in time where it happens, and excesses the sizes above the threshold. Surely extreme value theory should let you understand this. And there is already more than I said: for iid data, the points in time where exceedances happen, if you go high enough, are Poisson — the famous Poisson theorem of rare events, a key theorem of mathematics. Much more complicated is the convergence of this whole process, dynamically in time and space; that is a famous Leadbetter theorem, we do not even treat it in the book, but it can be done and it is a workhorse of extreme value theory. So rather than using blocking, which is natural in some contexts, we look at a high level and try to understand what happens above that level: these red peaks — can we describe them? Yes, we can. Famous theorem — here it is; even in a very applied course I like to write down theorems correctly, it may be annoying, but this theorem tells you the following. Look at the distribution function of your loss conditional on having a high loss, i.e. the distribution function of the excesses: given that you have an exceedance, how large is the excess? The theorem tells you that there is again something in my left pocket: I can give you an explicit distribution function that approximates the distribution of these red sizes. That is great, because that is exactly what I want, and this distribution function is written on the slide: an explicit distribution, the generalized Pareto distribution, the GPD. I know of thousands of papers in finance — GPD, GPD everywhere. Why? If you have iid data and you are interested in these conditional excesses, the red peaks, their limiting distribution is a GPD — nothing to do with the Gaussian, it is not in my right pocket, it is in my left pocket. And now of course you see: I can do maximum likelihood theory again, I can build up the whole machinery. But let us revisit the theorem: it tells us that the conditional distribution of the little red peaks above the level, as the level goes to infinity or to the upper endpoint — so I had better go higher and higher in the data — is GPD, and we even get thrown in, mathematically, that the convergence is uniform; that you get as a present, nothing special here. Now, current statistics of extremes looks at this approach, because the blocking is not very natural in many applications: you look at your data, you look at the conditional extremes, and you model the conditional extremes. So let me go back — the definition is here: you are interested in the conditional excess distribution (we should have used the colour red here): given that you have a loss above the level u, what is the distribution of the excess?
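The objects in the Pickands–Balkema–de Haan statement, written out — the excess distribution over the threshold u and its generalized Pareto limit:

```latex
F_u(x) = P\bigl(X - u \le x \mid X > u\bigr)
       = \frac{F(u+x) - F(u)}{1 - F(u)},
\qquad 0 \le x < x_F - u,
\qquad
G_{\xi,\beta}(x) =
\begin{cases}
1 - \bigl(1 + \xi x/\beta\bigr)^{-1/\xi}, & \xi \neq 0,\\[4pt]
1 - e^{-x/\beta}, & \xi = 0,
\end{cases}
```

and the theorem says that for F in a maximal domain of attraction one can choose β(u) such that sup_x |F_u(x) − G_{ξ,β(u)}(x)| → 0 as u → x_F.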
The excess is the loss minus the level — the red peak — so that is exactly the object that appears in the theorem. The theorem is Pickands–Balkema–de Haan: Pickands on the one hand, and Balkema and de Haan on the other; they proved the theorem independently in the mid-seventies. And this distribution I can easily write down, and I can link it to what we now call expected shortfall — careful, but take a look at the expectation, if it exists, of this distribution: the conditional expected loss given a high loss, the expected shortfall; the u is either kept there or added back on. This, by the way, is often also a difference in thinking between finance and reinsurance: a reinsurer, in an excess-of-loss treaty, is interested in the event that the risk pierces the level u and in the fair premium, which is the expectation of the excess loss — that is what a reinsurer cares about; in finance, if you do not have derivatives or insurance in place, you bear the whole loss, so the thinking is a bit different. But for you it is expected shortfall, and you see that expected shortfall is still naturally there — that is why, well before many of the papers, we did expected shortfall in all our courses, and I still remember giving a talk on extremes to some famous banks in Zurich, saying: why don't you look at these kinds of quantities — this was well before the expected shortfall discussion — it is so natural, the expected size above these levels. Okay, and now you can work out different things; the statistics works: you can write down the log-likelihood, and you now look not at the red dots, the maxima, but at the red peaks, which in the limit, properly normalized, are GPD. I have swept some conditions under the carpet — there is this β(u), a scaling parameter — but never mind; and by the way, this holds if and only if the earlier condition holds, so the two theories, Fisher–Tippett block maxima and peaks over threshold, have the same mathematical conditions: it is the same way of looking at the problem mathematically, but very different from an applied point of view. Current applications of extreme value theory are almost completely in this field, peaks over threshold: it allows you much more flexibility — you can make the level u time-dependent, you can make your parameters covariate-dependent; you can do that in block maxima too, but it is much less natural. So this is the current approach for most applications of extreme value theory: you look at your data and you choose u. How? Well, that is the problem — it is a limit theorem, u goes to infinity, or to x_F if x_F is finite; where do you choose u? It is exactly as difficult as choosing the number of blocks, mathematically exactly the same issue, and there is no standard solution, whatever people tell you: some papers claim to have solved it, bootstrapping and resampling can help you, but there is no standard solution — in the end it is your decision. Okay. So you can write down the likelihood: you look at your excesses over the threshold, the number of excesses, and you write down the likelihood and estimate.
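A sketch of the GPD fit to the excesses, again with a hand-written negative log-likelihood and `optim`; `losses` and `u` are placeholders, and the same ξ > −1/2 caveat as before applies:

```r
gpd_nll <- function(par, y) {
  xi <- par[1]; beta <- par[2]
  if (beta <= 0 || abs(xi) < 1e-6) return(1e10)  # crude penalty; excludes the exponential boundary
  z <- 1 + xi * y / beta
  if (any(z <= 0)) return(1e10)                  # support constraint of the GPD
  length(y) * log(beta) + (1 + 1/xi) * sum(log(z))
}

fit_gpd <- function(losses, u) {
  y <- losses[losses > u] - u                    # the "red" excesses over u
  fit <- optim(c(xi = 0.1, beta = mean(y)), gpd_nll, y = y)
  list(xi = fit$par[1], beta = fit$par[2], Nu = length(y), n = length(losses), u = u)
}
```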
Then you use a lemma — there is an interesting lemma here, which I will only mention briefly: if you have exact Pareto data, exact power-tail data, then the mean excess function — the expectation of the excess loss given a high loss — is a linear function of the threshold; the GPD has that property for ξ positive, and even though it is a limit result, once it is linear from a certain point onwards it stays linear. That is written down in the lemma; I'll skip the lemma now, but it is very important. Let's look at how you estimate it. This mean excess function, the expected excess loss size — the expectation of the red part given that you exceed — is clearly an important function: it would be the fair premium in an excess-of-loss reinsurance treaty, so it had better be important. How do you estimate it? Trivially: I look at my data, I sum the excesses and I divide by the number of excesses — the empirical estimator, the easiest of the easiest. Here it is, that is your estimator: if you want to estimate at the level v, you look at the excesses above v — v typically larger than your initial threshold — you only count the red parts (that is the indicator function) and you divide by the number of red parts, the number of excesses above v, which is random, it depends on your data. And this function you now plot as a function of the threshold, or of the order statistics: of course you can only plot it once you have data, so you take your data, order them, and calculate this function. If this function is linear from some point onwards, we feel we are not far away from a Fréchet world, because for the generalized Pareto it really is linear. I know I am going very fast — I really just want you to understand the gist of it; in 1995 we gave a whole week of courses on this. But it is the same problem as finding the block size: I can only help you in starting the procedure. So here is an example — I think I am okay on time, and then you take over. It is a famous data set, the Danish fire insurance data, because I think it is much more important to talk to you about data. But I hope you first understood a couple of things: if you are happy using the Gaussian distribution for solving statistical problems related to averages — and I think you are all happy doing that, with finite variance — then by the same logic you must be happy using the GPD for iid data when you are interested in excesses over a high threshold; otherwise we have a serious problem. As a mathematician I do understand how you think, and what I may or may not agree with: the kinds of questions you ask about sums are perhaps easier to swallow, whereas the kinds of questions you want to ask about extremes extrapolate beyond the data — and now I suddenly have a model that allows me to extrapolate anywhere — so society is often using this theory a bit too quickly, thinking it will solve its problems concerning extremes. But the whole world is unfortunately, at all levels, thinking about extremes; that is why I say it is finance 101, I would even say society and political-science 101: we must be able to understand what it means to look at extremes and what kind of behaviour they have. If you take that away from the first part, I am happy — and you take away that there is a theory at the level of the central limit theorem that helps here. Now, some data: real data, already in our yellow book.
These are the Danish fire insurance loss data: losses in Danish kroner, a wonderful data set — losses due to fire in industrial buildings and dwellings — and I had better look at extreme value theory here, because there are more than 2,000 observations. The data are all deflated to 1985 prices; in those days inflation still mattered. Here they are, and they are typical of such data — if you look at log-returns later you will see something similar: most of the time, I would say, nothing much is happening, but suddenly you get these huge losses. We have analyzed and re-analyzed these data — Alex did a lot of work on them — and the data are available, I think even on our repository; Alex has them on his website, it is public data. It has become a very famous data set. What is the most famous data set in multivariate statistics, say in discriminant analysis? It goes back to Fisher: the Fisher iris data — Iris setosa, Iris versicolor, Iris virginica — the standard data set in discriminant analysis, linear and nonlinear. The Danish fire data is not at that level, but I think it has become a standard data set in extreme value theory. Here is the mean excess function — you can start doing it by hand, though of course you need a computer to do it fully. I go to level 10: at level 10 I average all the excess losses above level 10 and divide by the number of losses above 10 — that is the value of the function at level 10; the exact numbers are not important. You draw this picture, and the first thing you notice is that it is increasing. Going back to your question, Michael: what does it mean for a mean excess plot to be increasing? It estimates the expected excess loss at increasing levels, and that estimate keeps growing: the higher you go in your data, the higher, on average, the excesses above that level. Let that go through your mind, even early in the morning — you gradually understand that we are in a very dangerous world with this kind of data. And this comes back slightly to your question: there is a data set in my book on extremal events, an insurance data set, which shows the opposite — at some point the points go down, a typical example where the excesses get smaller and smaller; but here I do not see that. So this is pretty Pareto — of course not exactly Pareto, because there is curvature here, and with roughly 2,000 observations every curvature you see is significant; you can do regression, testing, whatever — it is significant. So near the origin I am not yet in the limiting regime of my model, the Pareto, the Fréchet — it is not linear yet. Out at the far end the points are always all over the place, because there are only a few data points — if my level is here, I am averaging over one point, so you get a lot of variation. So where would I now put my initial threshold u? Around ten, I think: there is this initial curvature, and from there on I think it is pretty linear. By the way, that is eyeball expertise.
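The empirical mean excess function just described, as a short base-R sketch (`danish` stands in for the Danish fire loss vector); plotting it over the order statistics is exactly the plot discussed here:

```r
mean_excess <- function(x, v) mean(x[x > v] - v)    # average excess over level v

me_plot <- function(x) {
  v <- sort(x)
  v <- v[-length(v)]                                # drop the largest point (no excesses above it)
  e <- sapply(v, mean_excess, x = x)
  plot(v, e, xlab = "threshold v", ylab = "mean excess e_n(v)")
}

## mean_excess(danish, 10)   # the value at level 10
## me_plot(danish)           # the full mean excess plot
```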
And the point I make is that there is no standard solution you can bring to this: we can only help you in deciding where to start an analysis, and I would certainly do the analysis and then redo it at various other thresholds. This is the fit to my conditional excess distribution, and it is also a fit to the tail of the underlying distribution: you see that my fit really curves nicely into the tail. But the important point is that I fitted only the data above ten — I took u equal to ten. In the early days I got a lot of discussion: people said, why do you fit this, if I look at your fit below u it is horrible? I am not interested in fitting the data in the centre — I can do that with a smoothed histogram; in some cases you of course want a model that fits from zero to infinity, but I only fit the tail here, that is important. And now of course I can do whatever I like. Let me move on: I am interested in the tail of the distribution function. I rewrite the definition of the tail in terms of the conditional excess distribution — trivial algebra: the probability of a loss larger than x equals the probability of a loss larger than u, times the probability, given that I am above u, that the excess is larger than x minus u. One line of algebra, first-year probability, not even that. Now, how do I get an estimate of the tail? You put hats everywhere. For the first factor the best I can do — we haven't written it down formally — is to estimate it by the number of excesses divided by little n: if I have 7 excesses out of 12, it is 7 over 12. This identity holds exactly; the conditional part I can estimate because, for u large enough, I know it is GPD — exactly in the limit, or asymptotically I am close to the limit. So you see, immediately you get an estimate of the tail. And why would you be interested in the tail? Because you want value-at-risk and expected shortfall: you invert. There are the formulas — the exact formulas for the expected shortfall and the value-at-risk in the limit model — or you put hats on everything and get your estimates, for u large enough.
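The tail, value-at-risk and expected shortfall estimators referred to on the slide, as a sketch built on the hypothetical `fit_gpd` output from above (these are the standard GPD-based formulas; the expected shortfall one requires ξ < 1):

```r
tail_prob <- function(x, fit)                       # estimate of P(X > x) for x > u
  (fit$Nu / fit$n) * (1 + fit$xi * (x - fit$u) / fit$beta)^(-1 / fit$xi)

var_gpd <- function(alpha, fit)                     # value-at-risk at confidence level alpha
  fit$u + (fit$beta / fit$xi) * (((1 - alpha) * fit$n / fit$Nu)^(-fit$xi) - 1)

es_gpd <- function(alpha, fit)                      # expected shortfall, needs xi < 1
  var_gpd(alpha, fit) / (1 - fit$xi) + (fit$beta - fit$xi * fit$u) / (1 - fit$xi)

## e.g. var_gpd(0.99, fit); es_gpd(0.99, fit)
```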
I remember showing this to Wall Street as a general point in the late 90s: I said, these are more relevant estimates of value-at-risk if you want to use value-at-risk — and even then I told them, do you really want to use that? They said: nobody is ever going to buy this. I don't mean buying as in putting money on the table — it was simply never accepted. Why? How do I interpret this if I go to a trader? You are a trader: what is your ξ, what is your β? Think back to the Gaussian model, remember the formula from yesterday: the value-at-risk is the mean return — which is very difficult to estimate — plus sigma times the inverse standard normal at the confidence level. Remember the sigma, the volatility: if I ask a trader for the volatility, they know it, because that is their heartbeat — the more it jumps around, the more problems they have with their heart. But for ξ and β there is no such feeling; none of these parameters is intuitive. I am sorry, but this is the real world — I am not talking about the fantasy world of Gaussianity, this is reality — and of course your estimates come out very different. And certainly the statement of the CEO of bank XYZ that an event was way beyond the lifetime of the universe — the 1987 crash, or the financial crisis, the credit crisis — comes out, in these models with the right parameterization, as something like a one in 50, one in 100, one in 200 year event, and that should concern every CEO. Okay, so that is what extreme value theory can do, and we summarize it on the next slides — should I continue? You are trying to chicken out... okay, I don't have too much left. Anyhow, the point is that you should be able to do this by hand; that is the good news of EVT: all these limiting models are explicit as distribution functions — not a density you still have to integrate — so you can invert them. If you go back tonight and want to find these formulas, they are there, you can calculate them by hand. So you can do all the estimation. This is what I said to Wall Street: this would be your estimator — you replace the first factor by N_u over n, the number of excesses divided by n, and in the GPD you do maximum likelihood estimation, which works fine if ξ is greater than minus 1/2. And then you can do what we call profile likelihood, because you are interested in likelihood functions of this complicated combination of ξ and β; there is a standard technique for that, profile likelihood. Should I continue — when should I stop? Two more slides, okay. So here is the GPD analysis of the Danish data. First of all we go to log-log plots. Why log-log plots? If the tail is x to the minus alpha, the log of the tail is minus alpha times log x, so the log-log plot is linear with slope minus alpha: you see a straight, sloping line — pretty Pareto. By the way, a famous statistician — unfortunately he has died — once said to me: there are lies, damned lies, and log-log plots; the other version of the quote is of course about statistics. Be careful with log-log plots: they turn everything into a Pareto, and I know some people who claim everything is Pareto — but if you plot, say, lognormal data this way, it curves. So here are the data above 10: the black dots are my data on this new scale, and this is my fit, and you will say, that is a damn good fit, that is not bad — I have really captured the data. The black line is my model, it curves very slightly; it is a wonderful model. If I go back to my CEO of bank XYZ with the Gaussian, he would have the Gaussian model here, and any of these observations would be impossible — of course they were possible, because they happened, but only "once in the lifetime of the universe"; here they are perfectly acceptable. Now comes the dangerous part, because now we become audacious: this is a log scale, so we can say, I want the one in 10,000 event, the one in 100,000 event — whatever you want, I will give it to you. Well, now you look at confidence intervals, and that is where the profile likelihood is important. Is it on the slide? It is there — can you see it? Look carefully: it is the same data. I now estimate the value-at-risk at a high level; the point estimate is about 40 here, that is my value-at-risk, and I want a 95 percent confidence interval for it. You see this very faint curve here?
And you see this very faint curve here? Alex crafted this plot in the early days. On the right-hand side, this is likelihood theory: the 95 percent confidence interval. You draw the so-called profile likelihood — standard statistics — it looks like a parabola; you cut it with a horizontal line, and that gives you the confidence interval. These confidence intervals will be skewed towards the heavy-tailed side and rather wide — remember this is a logarithmic scale. But now let's look at the expected shortfall. The xi here — I didn't mention it — is in the range of one third, so one over xi is about three: the tail behaves like x to the minus three, like a lot of standard loss data — infinite fourth moment, finite moments below order three. Now look at this curve here: that's the confidence interval for the expected shortfall. The expected shortfall for these data is estimated at about 58 million Danish kroner, with a 95% confidence interval going roughly from 40 to 155. So you go to your manager and say: by the way, our expected shortfall — the new famous risk measure — is about 60 million, but my 95 percent confidence interval runs from about 40 to 155. Guess what will happen. And that is exactly extreme value theory: it tells you, even for the nicest of the nicest data — and this is real data — that EVT opens your eyes for estimating far out in the tail, but it also gives you a feeling for how uncertain these statements are, and you need to bring other thinking into it. The Dutch dyke builders of course brought in other thinking; my engineering friends bring expert advice from the field of engineering, depending on the model they use. It just shows you: here I am still well within the data, but if I move further out, these confidence intervals are all over the place, nearly out to infinity — they might even diverge. That's the reality. I hope we can still do a little bit more; I'll leave the timing aside for the moment. You can do extreme value theory also for time series, ARCH and GARCH models, but explicitly only for very few of them. Okay.

So I just want to quickly point out some R scripts related to EVT. There is one around the Black Monday event that essentially illustrates the block maxima method Paul described, and there is one that reproduces the plots concerning the Danish fire insurance data set. What I would like to start with — it is probably the only one we will discuss — is actually something more theoretical. You saw the big results: in the very beginning the strong law of large numbers and the central limit theorem — the right pocket — and then the Fisher–Tippett theorem and Pickands–Balkema–de Haan. I would like to at least convince you graphically, in R, that these results hold, that these theorems are true. As Paul mentioned, the proofs are non-trivial, but maybe we can simulate a little bit, play around with it in R, and see whether they work. Okay, good. So I simulate 50,000 observations from my favourite distribution, the Pareto distribution. As Frank mentioned yesterday, this is just one large sequence of fifty thousand observations; the theta parameter is three — a Pareto(3), yes, roughly like the Danish fire insurance data. I then build the cumulative sums — the first observation, the first two, the first three, and so on — and divide by the number of observations, so I build averages over more and more of the data.
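Here is a rough reconstruction in R of this strong-law experiment — my own quick version, not necessarily the exact script used in the course:

```r
## Strong law of large numbers for Pareto(3) data: running averages
## converge to the true mean theta/(theta - 1) = 1.5.
set.seed(271)
n <- 50000
theta <- 3
X <- (1 - runif(n))^(-1/theta)     # Pareto(theta): P(X > x) = x^(-theta), x >= 1
running_mean <- cumsum(X) / seq_len(n)
true_mean <- theta / (theta - 1)

plot(seq_len(n), running_mean, type = "l", log = "x",
     xlab = "n (log scale)", ylab = expression(S[n]/n),
     main = "SLLN for Pareto(3) data")
abline(h = true_mean, lty = 2)     # the limit promised by the strong law
```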
And of course, by the strong law, I should see convergence here — and this is the convergence, shown again on a log scale, towards the mean. So that's your strong law of large numbers, S_n over n as a function of n, and it seems to hold fairly nicely. Now, for the central limit theorem, I block that huge amount of data into different parts and then standardize each part — a location-scale transform, blown up by the square root of n — and then I have realizations that should jointly follow a normal distribution: that's the central limit theorem. So I take five hundred blocks, I split the data, I compute the true mean and the true variance (which I know in this case), I take the block sums and location-scale transform them as on the first slide of Chapter 5. Then I can look, for example, at a histogram: here you have a histogram of these location-scale transformed, square-root-of-n blown-up averages; I overlay it with a kernel density estimate, and I also overlay the limiting normal distribution, the standard normal, and that seems to hold quite nicely too. So, based on Pareto data, properly location-scale transformed sums follow the normal. Good — nothing really new here.

But then I can look at the Fisher–Tippett–Gnedenko theorem, and I can block in exactly the same way. I always use the very same data set — no different data set, always the same 50,000 observations. I do exactly the same thing, but now, instead of building block sums, I take block maxima: I build the blockwise maxima and then do the location-scale transform with the c_n and d_n you have in the slides. Gnedenko's theorem is very nice because it also tells me how to choose the c_n and d_n. There is one example Paul gave you as an exercise to work through; the interesting part about that example is that its c_n and d_n are not the ones from Gnedenko's theorem, and yet in the limit you get the standardized form of the generalized extreme value (GEV) distribution. The normalizing constants you get from Gnedenko's theorem, on the other hand, are very general — you can apply them to any F you start with — but in the limit you might not get the standardized form of the GEV, and this is also what we will see here. Because if I don't know anything about the distribution, or I don't have an educated guess, like Paul, about what the normalizing sequences are, my best guess is to follow Gnedenko and say: my c_n is essentially the quantile at 1 minus 1 over n, and my d_n I can take to be 0. Those are the normalizing sequences I take here: I subtract d_n = 0 and divide by the true quantile function evaluated at 1 minus 1 over n, essentially. So this is how I build my location-scale transformed data this time — exactly the same data, just location-scale transformed maxima instead of sums, with an obviously different transform — and I should get a GEV. The plot then looks like this: I have a histogram again, a density estimate, and the corresponding limiting GEV density overlaid. It's a bit trickier — one would like to see this in log scale too; I tried, but it's not as easy because of the histogram, and these quantities can hardly be negative, so there are some numerical issues here — but overall this looks quite nice, I guess. And then I can also look at the QQ plot of these location-scale transformed maxima versus the quantiles of my limiting GEV, so I also see something about the tail.
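Here is a sketch in R of the block-maxima part of this experiment, using exactly the quantile-based normalization c_n = F^{-1}(1 - 1/n), d_n = 0 just described; for a Pareto with parameter theta this gives, in the limit, the Fréchet distribution with shape theta (a GEV with xi = 1/theta, but not in its standardized form). Block size, seed and the QQ-plot comparison are my own illustrative choices.

```r
## Block maxima of Pareto(3) data, normalized by c_n = F^{-1}(1 - 1/n), d_n = 0,
## compared with the limiting Frechet(theta) distribution.
set.seed(271)
theta <- 3
X <- (1 - runif(50000))^(-1/theta)                # same Pareto(theta) sample as before
blocksize <- 100                                  # 500 blocks of size 100
M <- apply(matrix(X, nrow = blocksize), 2, max)   # blockwise maxima

cn <- blocksize^(1/theta)   # = F^{-1}(1 - 1/blocksize) for this Pareto
dn <- 0
Y <- (M - dn) / cn          # normalized block maxima

## QQ plot against the Frechet(theta) quantile function (-log p)^(-1/theta)
p <- (seq_along(Y) - 0.5) / length(Y)
plot((-log(p))^(-1/theta), sort(Y),
     xlab = "Frechet(3) quantiles", ylab = "normalized block maxima")
abline(0, 1)
```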
And this actually looks very nice. We have had a lot of discussions about QQ plots, and we have shown quite a few since yesterday, so one remark here: it is always a bit difficult to interpret QQ plots without confidence bands. If you compare two QQ plots based on different sample sizes, it is not so easy to say where the fit is better, because you don't have the notion of a confidence interval, and that is always important. There is actually an R package that can do this — a QQ-test package — it is just a bit more complicated to use, at least at the moment; it would also do the bootstrap we discussed yesterday and give you bootstrap confidence intervals, which makes QQ plots easier to interpret. Just for those who had questions about QQ plots yesterday. Good.

But that was the block maxima method: I block my data essentially along the x-axis, pick out the maximum of each block, properly location-scale transform, and that gives me my GEV limit distribution. Now I do the blocking — not really blocking, but something similar — along the y-axis, as Paul mentioned: I fix a threshold and consider all losses that lie above it to be large, or important. That is the second approach to EVT, the peaks-over-threshold method. As a rule of thumb — and this is actually very dangerous to say — one typically takes something like the 90% quantile; we have had operational risk data where we had to use the 50% quantile because otherwise there are too few excesses, so there is a lot going on there. I use here, just as a rule of thumb, the 90 percent quantile of my data, and then I grab all the losses I simulated from the Pareto that exceed that threshold. But be very careful here; this is what students typically get wrong. They typically stop at this point: they look at the exceedances, that is, the losses that exceed the threshold. But those losses come from a conditional distribution — the original distribution conditioned on exceeding the threshold — so they still live on the scale of the original distribution you started from; just take the threshold to be zero and they simply come from the original distribution. You have to subtract the threshold; you have to look at the excesses. So be very careful with the language: the exceedances are the losses themselves — the whole losses, so to say — that exceed u, and the excesses are the exceedances with the threshold u subtracted. This is important, because otherwise you do not end up with the limiting distribution. Good. So I look at the excesses — I try to stay close to the notation on the slides so it is easier for you to follow; we had Y there, for example — and I know those should be approximately GPD distributed. I do the same exercise as before, and it is getting a bit harder to judge from a histogram, but still: this is your histogram, the overlaid density estimate and the limiting GPD density. It is not a very good plot — a histogram is a rather crude approximation to a density anyway; you could put this on a log-log scale, but then you would need to discretize a bit more finely over here to get a nice plot towards the small values. That is essentially something I put together yesterday evening, just for you to see and play around with.
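Here is a small R sketch of the exceedances-versus-excesses point. For Pareto(theta) losses the excess distribution over u happens to be exactly GPD with xi = 1/theta and beta = u/theta, which makes it a convenient test case; seed, sample size and the 90% threshold are again illustrative choices.

```r
## POT step: threshold, exceedances, excesses, and a GPD QQ plot of the excesses.
set.seed(271)
theta <- 3
X <- (1 - runif(50000))^(-1/theta)   # Pareto(theta) losses
u <- quantile(X, 0.9)                # rule-of-thumb threshold (90% quantile)
exceedances <- X[X > u]              # the losses above u
excesses <- exceedances - u          # what should (here: does exactly) follow the GPD

## QQ plot of the excesses against GPD(xi = 1/theta, beta = u/theta)
xi <- 1/theta
beta <- u/theta
p <- (seq_along(excesses) - 0.5) / length(excesses)
q_gpd <- (beta/xi) * ((1 - p)^(-xi) - 1)   # GPD quantile function
plot(q_gpd, sort(excesses), xlab = "GPD quantiles", ylab = "sorted excesses")
abline(0, 1)
```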
[Audience question.] Yes, you mean this one? Yes, you could do that — but I think it also tells a little bit of the story to see it like this; and of course you could simply grab all losses that fall within a certain range, true. And then you can also look at a QQ plot here versus the GPD distribution. The beauty is really, again, looking at graphics, trying to understand and learn about the theory from graphics; you can illustrate these big limiting theorems of probability just by playing around a little in R. So that's all very nice. Good.

There are two minutes left; the question is what can be done in two minutes. Let me just quickly walk through reproducing some of the plots Paul showed you concerning the Danish fire insurance data set. Here you see the data from the slides. You can use the mean excess plot — the sample mean excess function, essentially — to determine the threshold: for the theoretical limiting GPD you know that this plot should ideally become linear at some point, and two possible thresholds to look at here are between 10 and 20. You can argue there is a little bit of a kink here, but afterwards, either from here or essentially from here, it mostly looks linear. And the biggest message, given the remaining minute, is — stressing again what Paul said — that you really need to do this exercise for several thresholds; never stop at one threshold. You want your estimates to be reasonably stable in the threshold choice, robust to a somewhat misspecified threshold, if you like, so you have to redo the exercise; a quick sketch of the sample mean excess function follows at the end of this part. Especially for operational risk we looked at a whole bunch of different thresholds. And this is the plot of the empirical excess distribution function versus the theoretical one — the theoretical one obviously being your limiting GPD. There is still much more to come in qrmtools about this; we have not covered too much of it so far, but those are the types of plots you get at the moment. And here you see the plot from the slides earlier: the risk measure estimate — in this case the expected shortfall at the 99% level — and here you get the bounds of the confidence interval; at 95% you see it is already quite large.

When these tools were introduced and operational risk was being discussed, of course the reasoning was: a one-in-a-thousand-year estimate for operational risk — this must be extreme value theory, it is exactly made for that. So they invited us — Valérie Chavez-Demoulin from Lausanne, Johanna Nešlehová, now at McGill, and myself — and we taught for a whole week from the yellow book, which is very mathematical, at the Federal Reserve in Boston. At the end of the week, talking with my colleagues at the Federal Reserve — about 15 excellent people from all over the Fed system in the US — the conclusion was basically that extreme value theory is not going to solve the problem: the data are just not of the quality that allows you to apply the standard methodology. And I think it was very positive, from that point of view, that extreme value theory pointed, as here, to the end of the line for the kind of questions you can answer. I am not saying that operational risk is not important, but you have to bring other economic thinking into the discussion.
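Before moving on to the regulatory discussion, here is the little sketch of the sample mean excess function promised above — written on simulated stand-in losses; if you have the Danish fire data loaded, simply use those instead:

```r
## Sample mean excess function e(u) = mean of (X - u) over the losses X > u.
## For GPD-type tails this should look roughly linear beyond a suitable threshold.
set.seed(271)
X <- (1 - runif(2000))^(-1/3)        # stand-in for the Danish losses

mean_excess <- function(x, thresholds)
  sapply(thresholds, function(u) mean(x[x > u] - u))

u_grid <- quantile(X, seq(0.5, 0.98, by = 0.01))   # candidate thresholds
plot(u_grid, mean_excess(X, u_grid), type = "b",
     xlab = "threshold u", ylab = "sample mean excess e(u)")
## Redo any GPD fit for several thresholds and check that xi, beta and the
## resulting VaR/ES estimates are reasonably stable in u.
```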
Look at the Swiss Solvency Test: under the Swiss Solvency Test — and I can look at my colleagues here — we do not have a Pillar 1 estimate for operational risk. Under Basel II we had the loss distribution approach with a very explicit value-at-risk estimate at the 99.9 percent level; now that is largely being scaled back. I am not saying that these methods — and that is also where our research is relevant, I think — should not be used; there is still something like ORSA, the Own Risk and Solvency Assessment exercise for a company, and there of course you should use these methodologies for learning about your data. Where I have a problem — where we have a problem — is in how you map the operational risk data experience, internal and external, and expert opinion, to capital. There is a lot of uncertainty, and whether the standardized approach is going to solve that, the way it is treated now, I am not at all certain; I am not totally happy about it, but that is another story. I really want you to walk away with the correct understanding of what EVT can do. I hope you have seen that we can really address interesting questions — the type of questions you ask yourself in the world of extremes — with a little bit of mathematical notation. There is no deep theory here, no Itô calculus or whatever; it is absolutely first-year probability, central-limit-theorem-type material. You use the mathematics to reformulate the questions in the language of extremes, you use the limiting results, you estimate, and you start thinking. But do not walk away with just the point estimate and say: well, this point estimate of, let's say, 58 — that's it, forget about all the rest. No, no: that curve, which you now see much better, is there, and it tells you the statistical uncertainty; and I am not even entering into model uncertainty, because these data are the nicest of the nicest real data I think we have. We are done — break — and then we start again.

So, I briefly skipped one little section, because in this lecture — it is one and a half hours — I can only give a summary of extremes, stressing the big guns of the field: Balkema–de Haan, Pickands, Fisher–Tippett and all that, which only deal with iid data. Of course, if you say, well, I have dependent data: there is a theory for dependent data. If I have non-stationary data, you can also do it, especially with peaks over threshold; if you have stationary but non-iid data, dynamic like in Alex's case, you can do it. The standard statement I would make is: if there is anything non-iid in your data — stationary but not iid, or non-stationary — first filter the data. Fit the non-stationary or dynamic component, look at the residuals; if those are iid — if you have done a good fit, a good filter — then you may want to use EVT at that level and transport the results back through your filter. This is exactly what is done in time series analysis: we now have the time series setup that Alex was looking at, and we talk about conditional EVT. So — I only do this briefly, just to tell you that it works — we have a GARCH-type process, depending of course on the specification of the innovations Z_t. In general you can find a kind of recursive equation — well, recursive but not quite — linking the value-at-risk, which is now a dynamic value-at-risk, to the innovations, which I assume here to be iid; this is the GARCH filter.
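Written out, one minimal way to state the link just alluded to — under the usual assumption that X_{t+1} = mu_{t+1} + sigma_{t+1} Z_{t+1}, with mu_{t+1} and sigma_{t+1} known at time t and iid innovations Z_t — is:

```latex
% Conditional (dynamic) risk measures for X_{t+1} = \mu_{t+1} + \sigma_{t+1} Z_{t+1},
% with \mu_{t+1}, \sigma_{t+1} known at time t and iid innovations Z:
\[
  \mathrm{VaR}_\alpha^t(X_{t+1}) = \mu_{t+1} + \sigma_{t+1}\, q_\alpha(Z),
  \qquad
  \mathrm{ES}_\alpha^t(X_{t+1}) = \mu_{t+1} + \sigma_{t+1}\, \mathrm{ES}_\alpha(Z)
\]
% where q_alpha(Z) and ES_alpha(Z) are the quantile and expected shortfall of the
% innovation distribution, estimated from the (approximately iid) standardized
% residuals, for example via the GPD tail estimator sketched earlier.
```

So the whole problem reduces to estimating a quantile and an expected shortfall for the iid innovations, and that is exactly where the GPD machinery from before comes in.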
The idea is that once you have this fit, the GARCH fitting takes away the non-iid structure, and you look at the residuals. By the way, it is a stylized fact that GARCH filtering very often gives you something close to iid but still leaves extreme values in the tails of the residuals — GARCH is not really able to take the extremes in the tails away — and hence extreme value theory becomes relevant at that level. So what you then do, once you have iid innovations in this recursive equation, is start the extreme value estimation at that level: estimate your GARCH parameters and then carry on. The world expert here is Alex: he and Rüdiger, sitting here, wrote a very nice paper on the AR(1)–GARCH(1,1) case, completely working out this procedure, and it became a very, very well known paper. So here you see exactly what I said: fit an ARMA–GARCH-type model, estimate the ARMA–GARCH parameters, fit the GPD to the iid disturbances — you see the Z_t here — get the estimates for those, and transfer everything back through the equations; a little sketch of this two-step procedure follows at the end of this part. This is not always easy to do: I think for AR(1)–GARCH(1,1) it works; if you go to higher orders — and correct me if I am wrong, Alex — it does not work that easily. But this is the standard procedure: pre-filter your data and then do the analysis. And to bring it back to the bigger picture: if your data very clearly come in clusters — which can happen — then you may want to do a cluster analysis, and there are all sorts of other techniques to go beyond iid, but that I think I should at least have mentioned. So now we go on.
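And here is the little sketch referred to above: a minimal illustration of the two-step "GARCH filter plus EVT on the residuals" idea. It assumes the rugarch package for the filtering step and uses a simulated placeholder series; this is one way to set up the procedure, not the code of the original paper.

```r
## Two-step conditional EVT, sketched: (1) fit an AR(1)-GARCH(1,1) filter,
## (2) apply POT/GPD to the (approximately iid) standardized residuals and
## plug the innovation quantile back into the dynamic VaR formula.
library(rugarch)

set.seed(271)
x <- rnorm(1000, sd = 0.01)   # placeholder for your (negative log-)return series

## Step 1: the GARCH filter
spec <- ugarchspec(variance.model = list(model = "sGARCH", garchOrder = c(1, 1)),
                   mean.model = list(armaOrder = c(1, 0)),
                   distribution.model = "norm")
fit <- ugarchfit(spec, data = x)
Z <- as.numeric(residuals(fit, standardize = TRUE))   # standardized residuals

## Step 2: EVT on the residuals, e.g. the POT estimators sketched earlier,
## to get q_alpha(Z); then the one-step-ahead dynamic VaR is
##   VaR_t = mu_{t+1} + sigma_{t+1} * q_alpha(Z).
fc <- ugarchforecast(fit, n.ahead = 1)
mu_next  <- as.numeric(fitted(fc))
sig_next <- as.numeric(sigma(fc))
## VaR_next <- mu_next + sig_next * q_alpha_Z   # q_alpha_Z from the GPD fit to Z
```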