Financial time series (QRM Chapter 4)

Captions
Okay, so I think I will continue. On every other day this week, apart from today, you only have an hour after the coffee break, which is probably more bearable. Today we have another hour and a half in the program, to make up for the missing half hour this morning, because the other piece of bad news is that we start at 8:30 every other day this week; but at least the final session will only be an hour from Tuesday onwards. In this final session today I want to talk a little bit more formally about time series, some concepts that we should have in mind. I then want to really spend some time with GARCH models, because I'm looking for models that can reproduce some of the stylized facts, and it turns out that the GARCH model is quite simple but can reproduce most of the stylized facts that I listed for a one-dimensional time series, so it's quite a nice model. We will simulate a GARCH model, we will fit a GARCH model to data and talk about its advantages and disadvantages, and we will use it for volatility estimation and forecasting, and talk about how that relates to simpler methods like exponentially weighted moving average, or exponential smoothing, volatility forecasting. Because for volatile financial time series, if these are our risk factors and we're trying to model the changes in value of a portfolio over time periods like a week or a month, we have to take the volatility into account; the volatility clustering is still present in weekly and monthly returns, and so it's a feature we should take care of.

So, 4.1, fundamentals of time series analysis. At this point the course looks a little bit more formal, because we've reproduced some of the definitions from the book, but particularly because it's late in the afternoon we don't want to get too bogged down in formalities, so I'll try to give the sense of all of these definitions. We'll spend about half an hour on fundamentals before we get into GARCH and then start simulating and estimating, so R will reappear in approximately half an hour. But to begin with, some of the preliminaries of time series analysis. Now, actuaries generally have to learn some time series analysis; I know that for the UK professional exams you need a course in time series analysis, and I've taught it to undergraduate and postgraduate students. There's a lot about stationarity and autocorrelations and ARMA processes, and a little bit about GARCH, so I'll just do a short tour through some of that classical time series material, some of the ideas that we should at least have some understanding of in order to talk about modelling risk factors, but I will then spend more time with GARCH. So what's a time series? It's a stochastic process. A stochastic process in general is a family of random variables (I think that appeared in Chapter 1 this morning) with some index set, and for a time series in particular this index set is going to be the natural numbers or the integers, different time points with a sense of ordering in time. Some of the things that appear on these slides: we need to talk about the moments of time series. Time series have mean functions and they have autocovariance functions, and until we've decided our time series is stationary, to be general, we have to first define the mean function as a function of time; when we start to talk about stationarity we can simplify this. So, to begin with, the mean function is a function of time.
It is just the function that gives the expected values of the time series. The autocovariance function is, to begin with, a function of two times, and it gives you the covariance between the random variable in this model at time t and the random variable at time s; t and s are two times, and in general these times will be integers. We write mu for the mean function and gamma for the autocovariance function, and these are the first two moments of a time series. Now, the models we use in practice, whether ARMA models or GARCH models or GARCH models with ARMA errors, are generally stationary; we work with stationary models. For people who've never done a course in time series, this is one of the difficult concepts to begin with, but I think a lot of you, being trained actuaries, will know some of this. There are two usual concepts of stationarity that one looks at. There is a concept of stationarity which is defined entirely in terms of these two moments, mu and gamma, and then there's another concept which is about distributions more generally. For a weakly stationary, or covariance stationary, or so-called second-order stationary time series, you basically look for three things. It should be a process (X_t) where the second moment is finite: it can't be covariance stationary if it doesn't have a finite second moment, so that's the first thing. More importantly, the mean function should be a constant, it should be unchanging, and the autocovariance function should be invariant under shifts in the time indices. So if I look at gamma(t, s) and I shift t by an amount h and I shift s by an amount h, that should not change the autocovariance: the covariance between X_t and X_s should be the same as that between X_{t+h} and X_{s+h}. If these three things are satisfied, it's called weakly stationary. Strict stationarity, on the other hand, is about distributions. Strictly stationary means that if I take a set of n random variables from this time series, say the random variables at times t_1, t_2, ..., t_n, and I shift all of them by an amount h, I don't change the multivariate distribution. This little piece of notation, which may crop up at various points in our book, is equality in distribution; so I don't change the joint distribution of these n random variables by doing a simultaneous shift in time. If the concept of a joint distribution is tricky, that comes up tomorrow when we talk about multivariate models; somehow it's almost impossible to decide the correct order in which to do time series and multivariate models, possibly the other order would be better. But what's the essence of both types of stationarity? It's the idea that X_t behaves the same in any time period; that's the intuitive idea. Whether we observe it in the year 2016 or the year 2013, it behaves, in some sense, the same. Now from an economic point of view this seems nonsensical, because everything changes in markets: the volume of trades changes over time, the rules change, the institutions change. So the idea that you would have stationarity, the same behaviour in any era, is obviously difficult to defend, certainly over longer periods, but in order to make progress with modelling it's what we tend to assume, at least over short periods.
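To pin down the two definitions just described, here is a compact restatement in the notation of the narration (mu for the mean function, gamma for the autocovariance function):

\[
\mu(t) = E(X_t), \qquad \gamma(t,s) = \operatorname{Cov}(X_t, X_s) = E\big((X_t - \mu(t))(X_s - \mu(s))\big).
\]

The process \((X_t)\) is weakly (covariance) stationary if

\[
E(X_t^2) < \infty, \qquad \mu(t) \equiv \mu, \qquad \gamma(t,s) = \gamma(t+h, s+h) \quad \text{for all } t, s, h,
\]

and strictly stationary if

\[
(X_{t_1}, \ldots, X_{t_n}) \overset{d}{=} (X_{t_1+h}, \ldots, X_{t_n+h}) \quad \text{for all } t_1, \ldots, t_n, \; h, \; n.
\]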
Over maybe four years or so, say, we might try to fit a stationary model. Of course we could have models where there are changes in regime, change-point models where you change from one stationary regime to another stationary regime; there are other possibilities, but in general, to make progress, we want to assume stationarity. As for the two ideas of stationarity, you would think that strict is stronger than weak; unfortunately it's not quite that simple. To be pedantic, strictly stationary does not imply weakly stationary (if I just write stationary, you can take it to mean weakly stationary), because we have to take care that the second moment exists: you can have a strictly stationary time series which has infinite variance. We don't need that for our purposes, but it's possible. The other way round, you would expect that weak stationarity doesn't imply strict, because weak stationarity only looks at the first two moments; third moments, fourth moments, these things are not considered. Now, for this line of equations let's just jump to the punch line: if you have a stationary time series, stationary in the first sense, then what you can show by a little bit of manipulation is that the autocovariance function only really depends on one thing, the separation of the two times, the so-called lag. It only depends on the lag h, which is the absolute value of t minus s, and because it's an even function, gamma(h) and gamma(-h) are the same. So for a stationary series the autocovariance function becomes a function of the lag, and in those pictures of autocorrelations, on the x-axis we had the lag h.

The picture here tries to convey the idea: could this be a stationary time series? It's not financial; I believe it's from Hawaii somewhere, monthly carbon dioxide concentrations. There are two reasons why you'd be unlikely to model that with a stationary time series. The first is that there is a systematic growth in whatever we're measuring, the CO2 concentration; a systematic, almost deterministic growth, and you have a feeling that you could lay some sort of exponential curve through this quite satisfactorily and predict the trend. Moreover, there is a very strict periodicity, an annual cycle in CO2 concentration, so there is another deterministic, predictable element in this time series. So whether you look at, say, December observations in the 1950s or March observations in the 1980s, it's clear that these are taken from different distributions: the level is different, and you're in a different part of the annual cycle. So that's not stationary. This one is, and I know that because it's actually a simulated realization from a stationary autoregressive process. Whether I look at it here or here, it behaves, you could say, somewhat similarly; it just looks like noise. It isn't in fact noise, but it behaves similarly in this period and in this period. So once we've decided that we can model data with a stationary model, or once we've made that heroic assumption, we can describe the correlations with a function of one argument h. This is called the autocorrelation function, and you get it from the previous one, the autocovariance function: you get it by dividing gamma(h) by gamma(0), and this function gives you the correlation between X_0 and X_h, or X_t and X_{t+h}, or X_t and X_{t-h}, any two random variables separated by lag h. This is the autocorrelation function.
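Written out, the quantities just described are, for a stationary series indexed by the lag alone,

\[
\gamma(h) = \operatorname{Cov}(X_{t+h}, X_t), \qquad \gamma(h) = \gamma(-h), \qquad \rho(h) = \frac{\gamma(h)}{\gamma(0)} = \operatorname{corr}(X_{t+h}, X_t), \qquad h \in \mathbb{Z}.
\]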
This is what we've been attempting to estimate from data, and we'll come to the estimator very shortly. So what's noise? Noise, informally, is a process with an ACF which is always zero except when h is zero; obviously, when h is 0 the ACF is 1, gamma(0) divided by gamma(0). A white noise is a process (X_t) which is firstly stationary and whose autocorrelation function is effectively the indicator of the event h = 0. That's just fancy mathematical notation for saying it is zero everywhere apart from when h is 0, and this is white noise. In words, it has no serial correlation: a white noise has no serial correlation, and because it's stationary it will have a constant mean and a constant variance. If we suppose that the constant mean is 0 and the constant variance is sigma squared, then we would say that (X_t) is WN(0, sigma squared), a white noise with mean 0 and variance sigma squared. This doesn't mean iid observations. For iid observations we have another name, which is strict white noise; so if I say strict white noise, that does mean a sequence of iid random variables. You may see WN, and you may see SWN, where the S is for strict, and again we can give the mean and the variance, mu and sigma squared; often we assume that the mean is 0. So this one is iid and this one isn't. Now why is this important? Well, a GARCH process, under conditions, is in fact a white noise. A GARCH process is actually a white noise: it's an uncorrelated process, it has lots and lots of structure, but it is a white noise.

To build that up in stages, there's another idea in the textbook (I'm doing the whole chapter on time series in an hour and a half, so bear with me), and that is the idea of a martingale difference sequence. Some other notation that you're going to see quite often: these calligraphic F's, which stand for sigma-algebras. They are again indexed by time, and when we have a sequence of them it's called a filtration. This is a mathematical model for, if you like, the evolution of the information that we have about this time series. So the filtration is the evolution of the information we have about the time series: as time goes on we observe more and more observations, we learn things, we collect information about the time series, and that's called the filtration. To be more particular, we often deal simply with the natural filtration, which is the one where each sigma-algebra is generated by the time series up to time t. So this really just describes the information that we collect as we observe the time series: we see more and more observations, and this information forms the natural filtration. There are other things we might want to know (we might be reading the newspapers, so we get other kinds of information that is external to the time series), and we can also put those into filtrations, but let's not worry about that. So these script F's, F_t, we'll often just call the history up to time t; it's everything we know up to time t, including the evolution of the process, and mostly only the evolution of the process itself. In credit risk we might put some other things in there that we know, but F_t is the history, or the information, up to time t. Okay.
Now, what's a martingale difference sequence? (X_t) is a martingale difference sequence if three things are fulfilled: it should be a process where the expected value of X_t is finite for all t, and each of these random variables should be so-called measurable with respect to F_t, but just look at the orange bit, which is the only important bit at this hour of the afternoon. The orange bit says that the expected value of the time series, given everything we know up to time t, is 0. So if X_{t+1} is the next observation, and F_t is everything we know up to this point in time, a martingale difference just says that the expected value of the next random variable is going to be 0. And in fact GARCH processes are martingale difference sequences: they obey this condition here. You might think, well, what's this got to do with martingales and differences? There's the one-liner there: this is the martingale property. If (X_t) were a martingale, the expected value of X_{t+1} given F_t would be X_t; if we difference that process, that is if we look at epsilon_t, then epsilon_t will be a martingale difference. So that's why. Anyway, this orange property is the only important thing on this slide to retain, because our GARCH processes will have this property, and you can show that if you have a process with this property, an MDS, and moreover you check that the variance is always finite and constant, then it's a white noise. What we'll find is that GARCH processes have this property and, if we take care of the parameter values, they will have a finite variance, and so they will be white noises. I think that's as much as I'll do there.

Right. So again, in standard actuarial courses and university courses on time series you spend many weeks studying ARMA, autoregressive moving average processes. To be honest, they're not particularly useful for modelling volatile financial returns, but we can put them in, we can use them a little bit; they're not the most important thing. So let's just make a few points about the next three or four slides and then move on to GARCH. This is the most general definition of ARMA: you build it from white noise. Let epsilon be a white noise, that is an uncorrelated process (it could actually be a GARCH process, but we assume it's white noise). Then (X_t) is an ARMA process if each X_t satisfies an equation of this form: a linear combination of X_t and previous values equals a linear combination of epsilon_t and previous values. The number of X's involved is p, and the number of epsilons you look at is q; that's the ARMA(p, q) process. As this subject developed, ARMA processes were used for modelling and predicting all kinds of stationary phenomena, but they're not much good for volatile financial risk factors, or at least not for the main features of volatile financial risk factors. I will use the word innovations: when I build a model like ARMA starting with noise, I will call the noise the innovations. At every time t you get the value of the process from past values of the process, past values of the noise, and one new innovation; that's the new thing, and if you like it represents the news, that which is new. Let's forget about backshift operators and the like; there are compact ways of writing these models involving operators, and there is a theory on which I would normally spend about five weeks with undergraduates.
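For reference, the two pieces of notation just narrated, written out (with phi's for the autoregressive coefficients and theta's for the moving-average coefficients; the letters are just the usual labelling, not quoted from the slide):

\[
E(X_{t+1} \mid \mathcal{F}_t) = 0 \quad \text{(the martingale difference property)},
\]

\[
X_t - \phi_1 X_{t-1} - \cdots - \phi_p X_{t-p} = \varepsilon_t + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q}, \qquad (\varepsilon_t) \sim \mathrm{WN}(0, \sigma_\varepsilon^2).
\]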
I'm trying to get through that theory in five minutes, as you can see, but on the assumption that some of you will have seen it, this at least allows you to link your knowledge of time series to what will, I hope, be a bit newer. We generally look only at causal ARMA processes. Causal ARMA processes can all be represented as linear combinations of the noise variables: they depend on past and present, but not future, noise, and that's what makes them causal. For these causal ARMA processes you get a relatively nice expression for the autocorrelation function, a formula that you can calculate, and it can have all kinds of behaviour: it can decay exponentially, it can look like a decaying sine curve, it can do various things. But typically with financial returns we won't observe the kind of behaviour given by this autocorrelation function. So let me move on; I really want to get to Section 4.1.3. I have included one dense slide which gives, in a nutshell, the very famous theory of when an ARMA process is stationary and causal, but the only thing I want to tell you about on this slide before we move on is this point here, the orange sentences: an ARMA process can be written in this form, X_t equals mu_t plus an innovation epsilon_t, and everything in mu_t is written here in this formula. The point is that mu_t is the conditional mean of the process. I'm assuming here that the epsilons form a martingale difference sequence, which is fine, and as long as (X_t) is a so-called invertible process, this mu_t will be F_{t-1}-measurable. The important bit is that mu_t is the conditional mean of the process and is given by this formula. So the way to understand ARMA processes is that they give a structure to the conditional mean; if you want to predict the conditional mean, that is, the expected value of X_t given information up to time t-1, then you build ARMA processes. But if I'm allowed to jump back to my stylized facts really quickly, I have this stylized fact here which says that conditional expected returns are close to zero. It is very difficult to predict a log-return of a stock price or an index or an exchange rate; often the best prediction is zero, and I know that because they have very little serial correlation, and you really need serial correlation to be able to make a prediction other than zero. (Question from the audience: going back to the other processes, couldn't there be a small net positive increase over time?) Yes, but look at the daily and weekly returns; looking back at the pictures we made with R, if you look at the actual estimates of the correlation between week t and week t+1, these are negligibly small. Over time, well, in my opinion, in my portfolio they don't always drift up over time! But okay, so an ARMA process puts structure in the conditional mean; if you're dealing with data where your best guess of tomorrow's return is close to zero, you don't need a lot of ARMA structure for the conditional mean. You can use it.
But what is going to be much more important is to model the conditional variance, and that's where GARCH comes in. What we're going to see is that a GARCH process is basically the most natural way of modelling the conditional variance that you can think of, and so it is completely analogous to ARMA: ARMA is a set of natural models for the conditional mean, and GARCH is a set of natural models for the conditional variance, and there are nice parallels between the two that I will get into.

Right, so to finish this first half hour: when we make those pictures, of course, we plot estimates. I told you there was a picture called the correlogram; these are pictures of estimated serial correlations. What do we actually do? We have a sample of size n, and we estimate the correlation at lag h by forming this ratio; x-bar is the sample mean. We just compute this, and that's our estimate of the autocorrelation at lag h. Theorem 4.1 just says that it tends to be quite a good estimate: under some conditions, if you have a causal process driven by strict white noise, you can show that these estimates are reasonable estimators of the true values, and in fact the asymptotic covariance matrix is known explicitly. This is the result on which the pictures that we saw are based. A special case of this result is when you're dealing with strict white noise itself, that is, when (X_t) is a process of iid variables; then we know how the estimated correlations should behave: these estimates should be asymptotically normal, and the covariance matrix should be the h-dimensional identity. This is where we get the confidence intervals from: our correlation estimates should lie between two normal percentiles with probability 1 minus alpha, and this is the interval which is shown in the correlogram. So it all comes from the statistical theory of how these estimators should behave. We plot the correlogram, we plot rho-hat(h) against h, except that we don't tend to draw points, we tend to draw vertical bars, and we call it the correlogram. We can supplement it with numerical tests: there are a variety of tests, usually involving Ljung-Box, in which you take sums of squares of these estimates and compare them against chi-squared distributions, and these you use as formal tests of whether your data are independent. So the null hypothesis here is that we have strict white noise. Under that null hypothesis we should have normally distributed estimates, ninety-five percent of them should lie within an interval defined by these boundaries, and if we sum up squares of them they should have chi-squared distributions; if that's not the case, then we reject the strict white noise hypothesis. (Chi-squared with how many degrees of freedom? It depends how many lags you take; h might be 10, so you might take the estimates at the first 10 lags, something like that.) And there is one very important comment at the bottom here: if you have a time series which is strict white noise, and you square it, or indeed take the absolute values, it is still strict white noise. Strict white noise is just another name for iid, so if I square it or take the absolute values, it is still strict white noise.
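As a minimal sketch of what these pictures and tests amount to in R (a sketch only: the object x, the placeholder data and the choice of 10 lags are illustrative and not taken from the course scripts):

# x: a numeric vector of (log-)returns; here just placeholder data
set.seed(1)
x <- rnorm(1000)

acf(x)        # correlogram of the raw series with 95% confidence bands
acf(abs(x))   # correlogram of the absolute values
acf(x^2)      # correlogram of the squared values

# Ljung-Box tests of the strict white noise hypothesis, using the first 10 lags
Box.test(x,   lag = 10, type = "Ljung-Box")
Box.test(x^2, lag = 10, type = "Ljung-Box")   # the crucial extra check on the squares

If x really were strict white noise, all of these checks should look unremarkable; for financial returns the checks on the absolute values and the squares typically fail, which is exactly the point made next.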
And it's never enough just to take the data, draw the correlogram, do for example the test of Ljung and Box, and then say: okay, I don't see anything in the correlogram, the test is not significant, therefore I have strict white noise. You've always got to look at the absolutes and the squares when you're analysing financial returns, because you may find that although the raw data pass, the absolute values and the squares fail completely to be consistent with strict white noise. Now it's going to be much better to do that with data, so we will do that, but after we've fitted some GARCH models. (You want me to say how big the data set should be? That depends on so many things: the nature of the underlying distribution, whether it is light-tailed or heavy-tailed. The question is how fast this convergence is; it's all based on the speed of the convergence to normality, and that will depend a lot on the distribution of the underlying innovations. When I am fitting models and doing some of these tests, I'm usually working with values of n which are multiple hundreds of data points; with daily data I would quite like two years, which is about 500 trading days, and might prefer four years. But as to how quickly these results kick in, that all depends on the nature of this strict white noise; I haven't said that the Z's are normal innovations, they could be heavy-tailed innovations. So yes, more than ten; I'd want more than a hundred as well; I'd happily settle for more than ten thousand; but certainly not three or four. What do you think, Marisa? It's a good comment: with very little data the theory doesn't work.)

So ARCH and GARCH are actually very natural models, and it's surprising that it took probably until the 1980s for them to be invented, because they are the natural way of modelling a changing conditional variance. Okay, it's a bit of a mouthful: ARCH is autoregressive conditional heteroscedasticity; we'll look at all of those words in turn, and the G is just "generalized", but we usually start with ARCH. I like these models because they're quite simple, but they're going to be able to reproduce a lot of the stylized facts, as we will see. So Engle came up with ARCH originally, and he said: well, let's have a process built like this. Let Z_t be strict white noise; so here the building block, the innovations, are strict white noise, that is an independent process, which could be normal but doesn't have to be, and we will set the mean to be 0 and the variance to be 1. What happens in an ARCH process is that you multiply the innovation Z_t by sigma_t, and sigma_t is given as basically a constant plus a linear sum of previous squared values of the process. This is a set of equations for all t, and the coefficients here have to be non-negative. Z_t could be normal, as I've said; quite often what we will take is a Student t distribution. There's a little problem with the Student t distribution in that, in its standard form, the variance is bigger than 1, so we scale it (you might see this notation quite often, it just means we scale it) so that the variance is equal to 1 and we have a strict white noise with variance 1. Sigma_t here, the way we've defined this process, is the conditional standard deviation of X_t: given the past values of the process, sigma_t will be the conditional standard deviation and sigma_t squared the conditional variance. And you can see how this works: if previous values of the process are large, this sum will be large, sigma_t will be large, and therefore the conditional standard deviation of X_t will be large.
Some of the remarks here: ARCH is a martingale difference sequence, and that is the first thing one can verify. I'm not going to go through the mathematics, because obviously it would be futile to go through all of these equations, but let me pick out the key ideas, which are mostly in colour. An ARCH process is a martingale difference: it has the property that the expected value of X_{t+1}, given the history, that is given all the variables up to time t, is 0, and I've said that this is a sort of stylized fact of financial time series, that your best guess of tomorrow's return is that it's 0, because you don't know whether it's going to go up or go down. So it's a martingale difference. Moreover, if it's stationary (let's assume it's stationary; later on we give the condition), then the conditional mean will be 0 and the conditional variance will be sigma_t squared, and we call sigma_t the volatility. So finally I will identify the concept of volatility with something mathematical: I'll identify it with sigma_t. It's changing in time, it depends on past values, and in this way I can get the volatility clustering, because if one of the previous values is large, or several of them are large, then X_t will tend to be large. This is what makes it ARCH: autoregressive, because X_t depends on previous values like an autoregression, and conditionally heteroscedastic, which means a changing conditional variance. As for the conditional mean, we're going to deal with that in a different way: we're not going to change the centering of this noise; what we're going to do to model that phenomenon is add something, a mu_t, and we're going to give that ARMA structure. So we'll have a bit of ARMA structure, even though we said it's not, let's say, a first-order effect or of first-order importance.

Now, the book takes you through the properties of ARCH, and it's a nice theory to teach because you can calculate lots of things. You can work out the conditions for when the ARCH process is strictly stationary and weakly stationary; they differ. The blue points sort of summarize it: you get a condition for strict stationarity, and what you also find is that you get a heavy-tailed process even if Z_t is normal. Suppose you start with Z_t normal: when you build this process you get an X_t which is not normal, in fact which is leptokurtic. So the GARCH mechanism creates heavy tails without your having to do anything else; it will create heavy tails out of normal innovations, and if you give it Student t innovations with a quite heavy tail, the X_t variables will have an even heavier tail. You can also show that the squared process is in fact an AR(1) process, which is quite nice; it's a parallel with autoregression: X_t squared is an AR process. Now, it looks like this: that's a realization of length one thousand. It is a volatile process; you don't immediately see it, it doesn't look volatile enough, if you like, if you remember the 2008 crisis, but it is a volatile process. That's the volatility, and it's changing: what you see in panel (b) is the path of sigma_t, the conditional standard deviation, which changes all the time.
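To see the mechanism just described in action, here is a minimal hand-rolled simulation of an ARCH(1) process with normal innovations; it is only a sketch, and the parameter values 0.5 and 0.5 are illustrative, not the ones used on the slides:

set.seed(42)
n      <- 1000
alpha0 <- 0.5
alpha1 <- 0.5            # alpha1 < 1, so the process is covariance stationary
Z      <- rnorm(n)       # strict white noise innovations, mean 0, variance 1
X      <- numeric(n)
sig2   <- numeric(n)     # conditional variances sigma_t^2
sig2[1] <- alpha0 / (1 - alpha1)   # start at the unconditional variance
X[1]    <- sqrt(sig2[1]) * Z[1]
for (t in 2:n) {
  sig2[t] <- alpha0 + alpha1 * X[t - 1]^2   # ARCH(1) recursion
  X[t]    <- sqrt(sig2[t]) * Z[t]           # X_t = sigma_t * Z_t
}
acf(X)      # looks like white noise
acf(X^2)    # but the squares are serially correlated, like an AR(1)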
When you look at the ACF, the theoretical ACF is the zero function except when h is zero. So we can generate some data and estimate it, and the estimates will largely lie within the confidence intervals, because this is a white noise. However, if we look at the squared values, compute the autocorrelations and make the correlogram, we will see that the correlations are non-zero; we will see that they are significantly outside the confidence interval. That's because the autocorrelation function of X_t squared is actually the same as the autocorrelation function of a first-order autoregression. So we're getting some of the stylized facts here: we are getting no serial correlation, but serial correlation of the squares (it would also hold for the absolutes), and we're getting some volatility, but we need to do a little bit better, and that's where GARCH comes in. It's just a refinement: you'll see that the only addition here is another term, and this term is a linear sum of past volatilities, so sigma_t now references past values of itself, and this structure creates more volatility. Engle's PhD student suggested this generalization. So if the previous X's are large, or previous volatilities are large, X_t tends to have a distribution with a large conditional variance, and the periods of high volatility tend to be more persistent. There are beautiful parallels with the ARCH theory: you get a condition for stationarity, which we won't go into, and you find out that the GARCH(1,1) process, which is the most popular model in practice, with p equal to 1 and q equal to 1, is like an ARMA(1,1) process for X_t squared, a process that can do a bit more than an AR(1). (Right, yes. It has to be strictly stationary, and that doesn't allow arbitrary freedom in these alphas and betas. I will only call it a GARCH(p,q) process if it's strictly stationary, and that doesn't mean I'm allowed any non-negative alphas and betas; in fact I will have to restrict the alphas and betas, and this is the restriction: if you have a process defined according to these equations, then it is only a GARCH(p,q) process if this holds.)

Okay, but what I wanted to do now was show you this here. This is the Chapter 4 script; what I'll show you first is GARCH simulation, and we'll have a quick look at the stylized facts. The package is called rugarch; it was written by someone called Alexios Galanos. It's a brilliant package, very powerful, with all kinds of things in it. Alexios works in a quantitative trading fund somewhere in America, but somehow he finds the time to maintain this package, and it's a kind of all-singing, all-dancing package. I'll just give you a little bit of insight into some of the things it can do. You'll see that the package is being loaded, slowly. The first thing I'm going to do is simulate one of these processes, and you'll see there's a function, ugarchspec. Let me pull up the documentation for the rugarch package; you see there are lots of functions in it, that's a list of the functions. Let's look at it this way.
ugarchspec: univariate GARCH specification. I just wanted to show you this because it gives you a sense of the level of detail in this package. When you specify a GARCH model, you specify the variance model; I've only talked about ARCH and GARCH, and what I've talked about is, in the language of this package, the standard GARCH, sGARCH, but there's a bewildering variety of other GARCH models that this will do: iGARCH, eGARCH, gjrGARCH, apARCH, csGARCH, fGARCH and so on, all kinds of variants on these dynamics, and that's just the variance model. Not only that, you can have a mean model: you can model the conditional mean through an ARMA, you can give an ARMA order, you can even have a fractionally integrated ARMA model, which I've never used. Moreover, you can choose different innovations: valid choices are normal, skew normal, Student t, skewed Student t, and so on. So there's all kinds of things in here. You'll see on the left-hand side what I do: I choose the standard GARCH, sGARCH, and I set the order to be (1,1); so just to be perfectly clear, and at the risk of repeating myself, I am talking about this model with p equal to q equal to 1. Now I have to make sure I have it the right way around. My mean model: I won't have a mean model, that's why the ARMA order is (0,0); I could have one, but it's going to be pure GARCH. I will use normal innovations, and I've picked some parameter values that correspond to the constraints we mentioned. So I specify the model, and if I look at the specification, this is a summary of the GARCH model spec: the conditional variance dynamics are sGARCH, and the conditional mean dynamics, well, (0,0,0) means there are no conditional mean dynamics. I'll come back to the slides a little bit later and write down what it means to have a model that is both ARMA and GARCH, an ARMA-GARCH; I'll come back to the theory of that later.

Okay, so let's simulate it; this is a nice package for simulating data which behaves a bit like financial returns. I'll simulate 2,000 observations, I throw away some starting values (that's not really important), and I'll make two paths; I could have 10 paths or 20 paths, each of length 2,000. You can see the horizon is 2,000, I have two simulations, and there's a little bit of summary information. What else shall I show you? These scripts are quite detailed. One thing about this package is that it uses the fourth generation of the S language: some people use third-generation objects and some use fourth-generation (S4) objects, and the fourth-generation objects are more complicated to understand. When people start using them for the first time they find it difficult to work out where all the results are stored in these objects, so there are a lot of commands here, getClass, getSlots, which analyse the structure of the objects created. I won't go through all of these, but if you want to work out what's in them you can work through this script at leisure. So that's the path of the volatility in my model; I made two simulations, but by default it's just giving me the first one. That's the path of the volatility, so sigma_t changes over time, and this one here is the process itself, the GARCH(1,1) process itself. And that's a kernel density estimate of the distribution of the volatility, sort of skewed to the right; I'm not sure if that's particularly interesting or not.
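A minimal sketch of the sort of rugarch calls being described here; the parameter values in fixed.pars and the object names are illustrative placeholders, not necessarily those used in the course script:

library(rugarch)

# Specify a pure GARCH(1,1) with normal innovations and no mean model
spec <- ugarchspec(variance.model = list(model = "sGARCH", garchOrder = c(1, 1)),
                   mean.model     = list(armaOrder = c(0, 0), include.mean = FALSE),
                   distribution.model = "norm",
                   fixed.pars = list(omega = 0.05, alpha1 = 0.1, beta1 = 0.85))
spec

# Simulate two paths of length 2000, discarding some burn-in values
path <- ugarchpath(spec, n.sim = 2000, n.start = 50, m.sim = 2)

vol <- sigma(path)    # simulated volatilities sigma_t (one column per path)
X   <- fitted(path)   # simulated series X_t
plot(vol[, 1], type = "l")
plot(X[, 1],   type = "l")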
And that's the kernel density estimate of the distribution of the time series itself. Now, even though I asked for a normal distribution for the innovations, the process itself is non-normal, as we will confirm very shortly. So let's jump down a little bit and look at some of this. This object, path, contains my simulations; it contains two realizations, but it also contains additional information, like the path of the volatility, the sigma_t, as well as the X_t. If I want to extract sigma_t there is an extraction function called sigma, so I can have a look and see that these are the sigma values for the first simulation and these are the sigma values for the second simulation, the first six of each. And I prefer that plot: here you can see the first simulation, a plot of the volatility, so you see some higher volatility here and some lower volatility. It's starting to look something more like financial data, but still not as extreme in terms of the volatility clustering. That's the second path; I did two paths. Now let me jump down a bit: this is the actual data, this is the X_t, and I'm going to look at this question here: does the simulated series conform to the stylized facts? Does it behave like the stylized facts of empirical finance would suggest it should?

Okay, there are two realizations here, so I'll just take the first one. Let's have a look at the ACF. This is the estimated ACF, or the correlogram, and you find what you would expect to find: the estimates at lags 1, 2, 3, 4 are minuscule numbers, some positive, some negative, and only one of them out of 30 or so protrudes beyond the confidence interval, which is supposed to be a 95% confidence interval. So I see this, I've got some data, and I think: oh, could I be dealing with strict white noise? Well, no, because what I should always do now is look at the ACF of either the absolute or the squared data; it doesn't really matter which. So we take the absolute data and compute the same picture, and you see significant correlations. So this is one of the stylized facts that this simple GARCH model reproduces, and it only had three parameters, incidentally: going back up to where it was specified, you'll see that it's made out of three parameters, alpha1, beta1 and omega. If you want to map these to the model here, well, alpha1 is alpha1, beta1 is beta1, but omega is alpha_0. So it's a very simple mechanism, but I create this serial correlation. And I could do the same thing with the squares instead of the absolute values and see what happens. Yes, it doesn't matter, I do the squares and I get the same kind of thing, absolutes or squares. The squares of a GARCH(1,1) actually follow an ARMA(1,1), for those of you who know ARMA; who knows what the absolutes follow, or what the name of the model they follow is, but they show serial correlation. Now, that was the correlation; what about the distribution? Well, there's the QQ plot: is it a straight line? I've seen worse. I'll put a straight line on it, through the point (0,0), and there is this inverted-S shape; it's not so pronounced, but the inverted-S shape that we saw earlier is present, and theory tells us that these observations are not from a normal distribution.
So they really aren't from a normal distribution; the GARCH mechanism creates heavy tails. I will run one normality test: the Shapiro-Wilk normality test, which is done with shapiro.test. I think it's in base R, you don't need any packages for that, whereas for other normality tests you might need some other packages. So I run the Shapiro test, and the p-value is very small, which means that the null hypothesis that these numbers could be normal would be rejected at the 5% level, and we would be fairly confident that we were dealing with non-normal data. That would be the correct inference, because the GARCH model creates non-normality even out of normal innovations; it creates heavy tails, or what we call leptokurtosis, which means longer, fatter tails and a narrower centre. Okay, that was the first thing. In some of these scripts you see, rather optimistically, exercises: try simulating paths with different innovation distributions. So you can go back to the specification and change the innovation distribution, for example to Student t, and then we would get more pronounced non-normality (obviously, if you're feeding it with Student innovations, why should you get a normally distributed process?); you would get even heavier tails. Yes, if you make the sum of alpha1 and beta1 go closer to 1, you would also get heavier tails, that is true, and if you've studied these things before, that is in the theory here: there is a formula for the kurtosis which depends on alpha1 and beta1. I'll pass over this point, but in GARCH(1,1) there is a formula for the kurtosis, and no matter what, the kurtosis of the process is bigger than the kurtosis of the innovations; kappa here is the kurtosis. So what you actually get depends on your choice of alpha1 and beta1, but in order that you are dealing with a stationary time series, the sum should be strictly smaller than 1. Perhaps I should have said that: if you want to be dealing with a second-order, weakly stationary process, alpha1 plus beta1 should be less than 1; then the variance is finite. If alpha1 plus beta1 is greater than or equal to 1, it is in fact an infinite-variance process, and we don't tend to use that case in financial modelling; we tend to work always with the case where alpha1 plus beta1 is less than 1, and when we fit the model to return data, that's what we tend to estimate, though the sum can be quite close to 1 while still less than 1.

Right, okay, that was just a simulation, and what I want to spend the remaining 15 to 20 minutes doing is fitting one to data. So, I talked about GARCH(1,1) and GARCH(p,q) here, and I summarized some of the properties of GARCH(1,1). There is a slide which summarizes the properties of GARCH(p,q), but we will jump over this, because in practice the most widely used model is GARCH(1,1). GARCH(1,1) is very commonly used, sometimes a GARCH(1,2) or a GARCH(2,1), but seldom anything of much higher order, so let's not worry too much about GARCH(p,q). The GARCH models are effectively like ARMA models for the squares; that's the best way of thinking about them, ARMA models for the squares. This is relevant, and it relates to your question, Frank, about how we model the mean effects, the fact that over time there is probably some tendency for the levels to go up and down: we can combine ARMA and GARCH.
We can combine GARCH for the conditional variance with ARMA for the conditional mean, and I've sort of defined the most general model here, as in the book, but basically the thing to look at is that we set up a model like this: we have X_t equal to mu_t plus epsilon_t, and epsilon_t equal to sigma_t Z_t, where Z_t is the strict white noise. The mu_t part follows a so-called ARMA specification, whereas the sigma_t part follows a GARCH specification. The net effect of these two parts is obviously a model of this form, and this is known as an ARMA model with GARCH errors; this is the kind of model that rugarch will fit to data. ARMA models with GARCH errors are quite flexible models: mu_t is the conditional mean and sigma_t squared is the conditional variance. As you saw in the package documentation, there are all kinds of other models, and one of the main features of these other models is introducing asymmetry. Particularly with price series, there's a suggestion that a large drop in the value of a price series will lead to more volatility than a large rise in the value of a price series; there is a kind of asymmetry of that form. In the basic GARCH the effect on the volatility is the same regardless of whether it's a large drop or a large rise, but that may not be what you want in reality, and various econometricians have collected evidence suggesting otherwise; they call it a leverage effect, for reasons I won't go into, if I can even remember them. Anyway, the inference is that you should have more volatility if there's a drop than if there's a rise, and so you introduce some asymmetry into the dynamic equation. This one, GJR, is really popular: these are three people, Mr G, Mr J and Mr R; the first one is Glosten, I forget the other two, but it's cited in the book. Threshold GARCH is also popular. So there are two ways of adding asymmetry to the model. One is this kind of idea: you introduce another parameter into the volatility equation, so that the volatility reacts asymmetrically to recent values. The other is that you use an asymmetric innovation distribution: you don't have to use a symmetric normal, you can use a skewed normal, and you don't have to use a symmetric Student t, you can use a skewed Student t. So there are two ways of putting asymmetry into the model, and my experience suggests that they can both have value.

Okay, how do we fit these models to data? Well, the short answer is maximum likelihood; we use maximum likelihood, and I will just do that. What will happen in the script is that I will take some data, I will fit two GARCH models, one with normal innovations and one with Student t innovations, I'll try to work out which is better, and I will look at the so-called residuals as part of that. The residuals in a GARCH model: this is the basic structure here. If we have an ARMA conditional mean term we have a mu_t, but often this is zero, and the residuals that we will be calculating are the ones at number two here, the so-called standardized residuals. After we've fitted the model, what we try to do is reconstruct the noise; the idea is that we reconstruct the noise, so Z_t hat is supposed to be our estimate of the noise, and those are the standardized residuals. Basically, we have X_t, we may estimate mu_t (that may be zero or not), we subtract that to get an estimate of epsilon_t, and then we divide by our estimate of the volatility to get the standardized residuals.
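Putting the pieces just described into symbols (this is the structure narrated above; the volatility equation is written in its GARCH(1,1) form, the special case used in the demonstration):

\[
X_t = \mu_t + \varepsilon_t, \qquad \varepsilon_t = \sigma_t Z_t, \qquad (Z_t) \sim \mathrm{SWN}(0,1),
\]

with \(\mu_t\) following an ARMA specification and

\[
\sigma_t^2 = \alpha_0 + \alpha_1 \varepsilon_{t-1}^2 + \beta_1 \sigma_{t-1}^2 .
\]

The standardized residuals used for model checking are then

\[
\hat{z}_t = \frac{x_t - \hat{\mu}_t}{\hat{\sigma}_t}.
\]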
What we're assuming is that this innovation process is strict white noise, so in order to judge whether the model is good, we test the residuals against strict white noise, as you will see. So this script is the one called GARCH estimation, and I think I have everything I need here. Take the Standard & Poor's 500, that's our index, compute the log-returns and take four years of them. I should probably plot that; so that's the data we're dealing with, and it spans the financial crisis, so there's plenty of volatility there. It's four years of data, so there are, let's say, around about a thousand values, because they are daily; the time series is of length roughly a thousand. To fit a model I first specify what I want. You will see here that I will fit a GARCH(1,1) model, but I'll put a little bit of ARMA structure in; in fact I'll take a first-order autoregression, just so you see how that works, and I will first assume that the innovations are normal. So the spec goes as follows: I take a GARCH of order (1,1), and I take a mean model of order (1,0), that's an AR(1); there is also a non-zero global mean. That is, in this process here I have a mu, p equal to 1 so I have a phi_1, and q equal to 0; so my parameters are a mu, a phi_1, an alpha_0, an alpha_1 and a beta_1. I specify the model, take normal innovations, and I fit it. The function that fits it is ugarchfit; it seems quite powerful and fast. So I fit the GARCH model, and when you look inside this object it contains an incredible amount of information; I'll just omit all of that. You'll see all kinds of numbers, and I have to scroll up quite far to get to the bits I want to talk about: GARCH model fit. The conditional variance dynamics are GARCH(1,1), the mean model is ARMA(1,0), which is a special case of ARFIMA(p,d,q), and the distribution is normal. So these are all my parameters: mu is the global mean, ar1 is the autoregressive term, omega is like the alpha_0, and then alpha1 and beta1. If you sum these last two numbers you will get something which is smaller than 1, but not by much, so this is a covariance-stationary GARCH model. It's the usual thing: you get estimates, you get standard errors, and then we have a look to see how this looks.

So in the script we plot a few things. This is one plot that it gives you; there's a menu of possible plots which you can look at (if you just type plot it takes you into a menu system and you can choose, I can't remember, eight or ten plots or something). This one here shows the series and what it calls 1% VaR limits: these are estimates of conditional quantiles of X_t, on both sides, and we'll come to how those are calculated very shortly. What I will do is add four pictures. These four pictures here: what do we have? We have the ACF of the absolute observations. rugarch has its own plots, so it doesn't use the standard ones, it uses its own kind of customized plots, but this is like an ACF plot; it doesn't have the customary bar of size 1 at lag 0, but the blue bars are the estimates, you can see the red confidence intervals, and you see all the estimates are outside, so there's lots of correlation there. Then the standardized residuals: if this is a good model, the standardized residuals should behave like strict white noise, because they are a reconstruction of the strict white noise innovations.
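A sketch of the fitting step being described, again with rugarch; the data object sp500.r is a placeholder name for the four years of daily S&P 500 log-returns, not necessarily the name used in the course script:

library(rugarch)

# AR(1) mean with a non-zero intercept, GARCH(1,1) variance, normal innovations
spec.n <- ugarchspec(variance.model = list(model = "sGARCH", garchOrder = c(1, 1)),
                     mean.model     = list(armaOrder = c(1, 0), include.mean = TRUE),
                     distribution.model = "norm")

fit.n <- ugarchfit(spec.n, data = sp500.r)   # sp500.r: daily log-returns (placeholder)
fit.n                                        # coefficients, standard errors, tests

coef(fit.n)                                  # mu, ar1, omega, alpha1, beta1

# Model checking: the standardized residuals should look like strict white noise
z <- residuals(fit.n, standardize = TRUE)
acf(as.numeric(z))
acf(as.numeric(z)^2)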
So we do the usual thing: we make a correlogram for the residuals, and we also do it for the squared residuals, and you're supposed to see here that 19 out of 20 of these values lie within the confidence intervals. I don't know, we have three out of 30 outside here, so maybe that's too many, and two out of 30 outside there, so one out of every 15 outside when it should be one out of every 20; but there's no obvious structure. In comparison with this picture, where we have all these non-zero correlations, when we look at the standardized residuals it's not obvious that there's any correlation left, in the residuals or in the squares or in the absolutes. But if we plot them, if we do a QQ plot against the normal distribution, we don't get a very straight line; we get something a little bit weird, with an awful lot of curvature. That's because not only is the normal distribution a bad choice, with tails that are too light, but there's actually a lot of asymmetry here, as mentioned. So the upshot is that this model is not quite good enough, and I go on to change the innovation distribution. The only thing I change is the distribution model, which is now Student t; otherwise the specification is the same, the same variance model, the same mean model, but the distribution changes. I fit that one, get a fit, and we look at the same numbers. What's changed here? Well, the only thing that's changed is that the distribution is now "std", for Student t, and there is one further parameter, "shape"; an extra parameter has popped up, and this of course is the estimated degrees of freedom of the Student t distribution, so it's just one more parameter. Keep those four earlier pictures in your mind and now look at these four pictures. Okay, so that hasn't changed, that's barely changed; there's only one out of thirty outside now; that has straightened up a little bit, and that's straightened up a bit, but there's still some curvature in here, so I wouldn't be completely satisfied yet. But if I formally compare the model with normal innovations and the model with Student t innovations, I would say the model with Student t innovations is much, much better. A couple of things one can look at: one can look at the log-likelihood at the maximum. In the Student t model the log-likelihood at the maximum is almost 30 bigger, which on the log-likelihood scale is a lot better. So now we get into the question of how you formally test for one model being better than another. You could do a likelihood ratio test, for example: the normal is, if you like, a special case of the Student t with an infinite degrees-of-freedom parameter, so you could test the hypothesis that the normal distribution is sufficient against the alternative that the t distribution would be better. That's what I've written here, in the annotations: the normal model is nested within, is a special case of, the t model, so you can do a likelihood ratio test. If you do this likelihood ratio test you get a significant result: that's the likelihood ratio test statistic, and it is much, much bigger than the corresponding quantile of a chi-squared distribution, so this is a very significant likelihood ratio test result; you would reject the normal special case.
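A sketch of the comparison just described, under the assumption that the only change from spec.n is the distribution.model (fit.n is the normal-innovation fit from the sketch above; all object names are placeholders):

# Same ARMA(1,0)-GARCH(1,1), but with (scaled) Student t innovations
spec.t <- ugarchspec(variance.model = list(model = "sGARCH", garchOrder = c(1, 1)),
                     mean.model     = list(armaOrder = c(1, 0), include.mean = TRUE),
                     distribution.model = "std")
fit.t <- ugarchfit(spec.t, data = sp500.r)

likelihood(fit.n)   # maximized log-likelihood, normal innovations
likelihood(fit.t)   # maximized log-likelihood, t innovations (one extra parameter, "shape")

# Likelihood ratio test: H0 normal innovations vs H1 Student t innovations
LRT <- 2 * (likelihood(fit.t) - likelihood(fit.n))
LRT > qchisq(0.95, df = 1)   # TRUE means reject the normal special case

infocriteria(fit.n)          # AIC/BIC comparison is also available
infocriteria(fit.t)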
(You look puzzled, Frank — yes, that extra parameter is the degrees of freedom here.) So H0: normal innovations are good enough; alternative: t innovations are necessary. We reject the null in favour of the alternative if the LRT statistic exceeds the 95th quantile of a chi-squared with one degree of freedom; here I'm taking the difference of log-likelihoods, which is like the log of the ratio. You can also do, if you know about it, an AIC comparison — those numbers are also hidden in the results, you can find the AIC. Anyway, what you will find is that the t model is much better, but you could do even better still, so let me just scroll down to the bottom exercise. Here are two ways you can make it even better: you can take a GARCH model with asymmetric dynamics — that's significantly better — and you can make the innovation distribution asymmetric; you can also try an ARMA(1,1) instead of an AR(1). All of these things, I think, will significantly improve the fit. So if you like, I've only just begun, and I'm not going to go any further, but all of that would do better. Why — what's the whole point of doing this? We can describe the risk factors, and one thing we can do with these models is forecast volatility. So I've just executed a command here called ugarchforecast: I've taken the better model, the t model, and with ugarchforecast I forecast volatility. My best forecast of sigma for the next day is this, and then these are the forecasts for day t+2, t+3, t+4, and these are forecasts of the series itself. So it's about time, I think, to wrap up, and let me just pick out a couple of things — you'll probably see that there are four or five slides left — let me pick out just a couple of ideas in the remaining five minutes, and then, as I say, you'll never have to sit through an hour and a half in the afternoon again, just 8:30 in the morning. We use the residuals to check the model; that's an important part of the model-building process. You fit it; you extract — point number two — what are called the standardized residuals; these should behave like strict white noise, so you check both the residuals and the absolute values or the squares, one or the other; you make correlograms; you can apply the Ljung-Box test, which is actually in the model output, I didn't show it. OK, there is another procedure which is quite common. What I did was MLE, maximum likelihood, but there is a procedure called QMLE, which is a fancy name for just using the Gaussian likelihood — just using the case of normal innovations. That's called QMLE even when you don't believe that's a good choice of innovation distribution, and there is some econometric theory which suggests that QMLE estimates of the dynamic parameters are not too bad. So some people actually do this in two stages: they estimate the dynamics simply using the Gaussian likelihood, called QMLE; they then extract the residuals; and in a second stage they start modelling the innovations with the best distribution they can. That's possible, and sometimes it's quite pragmatic when you have many time series. A use of the model is volatility forecasting, and for volatility forecasting I think we really just need to look at two equations — I've jumped onto page 90. After you've fitted the model, how do you forecast volatility? Well, it's the purple one here. You have all the data up to time t, let's say, so X_t is your last observation.
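The likelihood ratio test and the forecasts can be sketched as follows, reusing the same placeholder objects and following the recipe described above:

```r
## Likelihood ratio test of H0: normal innovations vs H1: Student t innovations
## (the normal is the limiting case of t as the degrees of freedom go to infinity)
LRT <- 2 * (likelihood(fit.t) - likelihood(fit.norm))
LRT > qchisq(0.95, df = 1)   # TRUE here => reject the normal special case

## Information criteria (AIC etc.) are also available in the fitted object
infocriteria(fit.t)

## Volatility (and series) forecasts from the preferred model
fc <- ugarchforecast(fit.t, n.ahead = 5)
sigma(fc)    # forecasts of sigma_{t+1}, ..., sigma_{t+5}
fitted(fc)   # forecasts of the series itself
```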
You estimate the parameters alpha_0, alpha_1 and beta_1, and then what you need is an estimate of the volatility on day t, which you have in the model. By using this formula here you get a forecast of volatility for — I've said day, but for the next time period: if your data are weekly you have a forecast for the next week, if your data are daily a forecast for the next day. It's just the defining GARCH equation, but you replace the parameters by their estimates, and you also put in an estimate of the volatility on day t, so you get a kind of recursive scheme: from the volatility on day t you estimate the volatility on day t+1, and then you can use that to estimate the volatility on day t+2 — well, actually it's a bit more complicated than that, because the formula for the volatility on day t+2 is given here; this is a general formula for volatility forecasting. Now, a final idea: there is a poor man's alternative to GARCH which involves no estimation at all, and this is what the RiskMetrics group pioneered at JP Morgan originally — the exponentially weighted moving average volatility filter. It's too much detail to go through fully, but have a look at this equation here, and now at the exponentially weighted moving average scheme: remember that, as Frank observed, alpha_1 plus beta_1 is close to 1 — the sum of alpha_1 and beta_1 is close to 1. So look at that equation: a very simple volatility forecasting scheme is this one here. It's a little bit unfortunate that I've put in a mean term; we could make that zero, so we eliminate the mean term, or just set it to zero, and you get a very simple volatility forecasting scheme. This is called the exponentially weighted moving average, and it looks much the same as GARCH except that there is no alpha_0, and instead of alpha_1 and beta_1 you have alpha and 1 minus alpha. This was suggested by RiskMetrics and tends to work quite well. There is a little bit of theory: it's a little bit like estimating a model called IGARCH — an IGARCH model is a GARCH model where beta is constrained to be equal to 1 minus alpha and alpha_0 is set to 0 — so the exponentially weighted moving average is a little bit like a so-called IGARCH, or integrated GARCH. Why is it called exponentially weighted? You're not supposed to ask the difficult questions, you're part of the team! Basically, because if you iterate a scheme like this — I think we removed it from the slides, it's in the book — you see that the forecasts are a linear combination of weighted previous values, and the weights decay exponentially: an exponentially weighted smoothing scheme. I don't have time to go into it, but if you iterate this equation it involves all the past values of the process with exponentially decaying weights. Here's the nice thing about R scripts: they run really quickly, so I can just draw a picture. Maybe you can see this here — this is my final offering of the day. These are different volatility estimates: the black line is GARCH, the dotted red line is the exponentially weighted moving average scheme, and it's pretty close to GARCH. There is a magical choice of the parameter that was recommended by the RiskMetrics group, which is alpha equals 0.06; I've called the 1 minus alpha lambda, 0.94. So it's a simple way of estimating volatility which has some relationship in theory to the GARCH model.
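A minimal sketch of the two forecasting recursions just described, with hypothetical helper functions and parameter values (x is a vector of returns):

```r
## One-step GARCH(1,1) volatility forecast: plug the estimated parameters and
## the current conditional variance into the defining equation
##   sigma^2_{t+1} = alpha0 + alpha1 * x_t^2 + beta1 * sigma^2_t
garch_forecast_var <- function(x_t, sig2_t, alpha0, alpha1, beta1)
  alpha0 + alpha1 * x_t^2 + beta1 * sig2_t

## EWMA / RiskMetrics scheme: no alpha0, weights 1 - lambda and lambda,
##   sigma^2_{t+1} = (1 - lambda) * x_t^2 + lambda * sigma^2_t,  lambda = 0.94
ewma_var <- function(x, lambda = 0.94, sig2_0 = var(x)) {
  sig2 <- numeric(length(x) + 1)
  sig2[1] <- sig2_0                                   # starting value
  for (t in seq_along(x))
    sig2[t + 1] <- (1 - lambda) * x[t]^2 + lambda * sig2[t]
  sig2[-1]   # element t is the variance forecast made with data up to time t
}
```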
We obviously don't have time to go into that, but it's described in the book at length, and it's as well to be aware of it. I call it poor man's GARCH because it's based on less of a formal model: IGARCH is a bit of a funny model, and in fact whether it's a valid mathematical model or not is debatable. People would probably say, well, it's expensive to do all this estimation, I have ten thousand time series and I don't want to fit a GARCH model to every one; it now seems that that's not a big excuse, because you can fit 10,000 GARCH models really quickly — but in the old days, in the 80s, that was the excuse. Anyway, we're here again tomorrow, but — just one question, sorry — yes, you can put a confidence interval around the volatility estimates. It's not reflected in what we've done here, but you can put a confidence interval around your estimates of sigma_t, or mu_t if you have it; you can make a predictive interval for X_t or for X_t squared; you can do all of these things as well. Whether it's completely explicit in the output — I think there's enough there to work it out; it may not be completely explicit. Right, so if you've been checking the time against our progress, you probably realize we're a little bit behind schedule. The final hour was to be finishing off the multivariate story, in the sense of bringing together the dynamics of yesterday's time series with the multivariate distributions of this morning by having some multivariate time series, and in fact I will take half an hour to do some of that. It will be a little bit abbreviated, because this material is so fundamental that we will be spending more time on it: on Thursday morning we come back to copulas and dependence, we recap some of these concepts and explore the world of dependent risks in more detail. But we have the elements now to talk a little bit about multivariate time series. We have the idea of ARCH and GARCH — so on this computer here I've brought up ARCH. You remember this idea that there is a volatility and this volatility depends on things in the past: you have ARCH, or, if you prefer, GARCH, which has a more complicated form. So on the one hand we have this, and then I will bring up — I'll switch over the view in a second — some of the things that Paul talked about this morning; Paul talked about this one here. Basically I want to take the idea behind formula 16, this formula here — maybe not so clever to change it in this place like that — but I want to blend these two ideas. I want to have a model which has some explicit dependence on the past, but I want to do it for a vector-valued process, and that's where multivariate GARCH comes in. Now I'm going to go back to this projector here — for the last time, make sure it's completely happy — and we're going to go on to the case of multivariate GARCH, at this point page 386, so this is chapter 14; we're no longer following the strict sequence. (What did you tell me — 386? There is a quicker trick for going to a page, but sometimes one just persists with scrolling.) So we go into multivariate time series now. I won't go through everything here, but there's enough included that it tells a coherent story.
What is really interesting is simply that it's the same as the univariate story — all the same building blocks, but just for vectors. It starts with the story of the moments of a time series; it goes on to the idea of stationarity — what does it mean to say a vector-valued time series is stationary? Again you can have strict stationarity and covariance stationarity, and once you have defined a stationary process you can begin to look at the autocorrelation function, although in this case it's called the correlation matrix function. It's a matrix-valued thing in this case, because we have not only correlations in time at different lags but also correlations across components, so we get something which is matrix-valued. I'm not going to go into that; I'm just going to point to it, say that we have something very similar, and go straight to the multivariate GARCH process, just dwelling on one definition on the way. We have ideas of noise in multivariate processes — we have the idea of multivariate white noise — but what you'll need for the definitions is multivariate strict white noise. That is really simple: (X_t) is a multivariate strict white noise if it's a series of independent, identically distributed random vectors. It's just the same definition, but for vector-valued random variables instead of scalars. So that's about as much as we need to go straight into multivariate GARCH processes — the definitions — and then to estimate them from data and get a feeling for what is possible. This looks like GARCH, so I'm going to say what a multivariate GARCH process is. You build it from strict white noise, but you're going to need something D-dimensional. By the way, we might introduce the dimension first: we might go down from 10,000 to three or seven or something, so let's say D is more modest, but it is bigger than one. What we want to do is create a process out of strict white noise by multiplying by a matrix, and this matrix will depend on the past values of the process — it will depend on this sigma-algebra here, which is constructed from the history of the process up to time t-1. Now, how exactly is it going to depend on the past? Well, that is the art of building the multivariate GARCH model. There are all kinds of ways, and in econometrics many papers were produced with specifications that were more and more fancy, had more and more parameters, and were more and more difficult to estimate, but I think really there are only one or two models that are feasible in dimensions higher than 2 or 3, and those are the ones I'm going to focus on. When you have a model of this kind — I don't have a mean term here — Z_t is strict white noise centred at zero, so Z_t has some distribution: it could be multivariate normal, it could be multivariate Student, in the same way that yesterday we had normal distributions and Student t distributions — something like that. The way it's set up, the expected value of X_t given the past of the process is 0, but just as we had for the multivariate normal distribution, you get the conditional covariance matrix of X_t given the past by taking A_t times its own transpose, and this is what we call the conditional covariance matrix. So we have a model where there is a conditional covariance matrix, and what we have to do is describe how this conditional covariance matrix depends on the past. I will use two specifications. A few notes first.
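As a small illustration of the definition (not from the lecture script): multivariate strict white noise is just an iid sequence of random vectors, for example independent standard normal vectors.

```r
## Simulate n iid D-dimensional standard normal vectors: multivariate strict
## white noise with mean 0 and covariance matrix I_D
n <- 1000; D <- 3
Z <- matrix(rnorm(n * D), nrow = n, ncol = D)   # row t is the vector Z_t

## A multivariate GARCH-type process then sets X_t = A_t Z_t, where A_t is a
## D x D matrix built from the history up to time t-1, so that
## Sigma_t = A_t %*% t(A_t) is the conditional covariance matrix.
```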
We could add in a conditional mean term: if you remember, yesterday we had ARMA-GARCH — GARCH for the conditional variance and ARMA for the conditional mean — and you could do the same in this setting. Here are two options: it could be a constant conditional mean, or there are vector ARMA (VARMA) processes; there's a simple one which is the analogue of AR(1), a vector autoregression of order one. You could do that, but for my examples I will have a zero mean term, and I will concentrate on modelling A_t — this matrix A_t — or, effectively, on modelling Sigma_t. There is a very neat decomposition of a covariance matrix in terms of a sandwich: Delta_t P_t Delta_t. These are three matrices, so you have to think about this. Delta_t is a diagonal matrix containing the volatilities, the conditional standard deviations — so it's D by D and on the diagonal it has the conditional standard deviations. P_t has the conditional correlations. Why P? Incidentally, the Latin capital P is a capital Greek rho, and rho is the favourite letter for correlations, so that's a big rho — the conditional correlation matrix — sandwiched between the matrices of conditional standard deviations. The whole art or science of building a multivariate GARCH is to say exactly how these elements depend on the past: how does Delta_t depend on the past, how does P_t depend on the past? Moreover, you've got to do it in a sensible way, so that these conditional covariance matrices always remain proper covariance matrices. You've heard from Paul that they have to be symmetric and positive semi-definite — well, let's actually make them positive definite, so we don't run into problems; we need this to be guaranteed. For the innovations, the Z's, once again normal or Student would do, or indeed, as Marius introduced them, anything spherical — any distribution with mean 0 and covariance matrix the identity would do in here. I'm going to use another library by Alexios Galanos, and of course Alexios has put in one or two options for the multivariate innovations as well. So the first possibility is the CCC model, which stands for constant conditional correlation GARCH. We have a process of multivariate GARCH form where the conditional covariance matrix depends on the past, but the sandwich is simplified: we have Delta_t, but the P_t will be held constant, so P_t is a single constant correlation matrix, positive definite. Delta_t is diagonal, it contains the volatilities, and the elements will follow ordinary GARCH — the GARCH equation that we revisited — and that's what we're going to do for each component series. Alternatives are of course possible: you could put in here any of the univariate GARCH specifications, for example the ones with asymmetric dynamics, which respond differently according to whether you have a fall or a rise in the series you are modelling — anything can go in here. Now, in this process, if you take the matrix of volatilities, take its inverse, and multiply X_t by the inverse, we will call that Y_t, and this operation has the effect of removing volatility. Doing this to X_t in this model will remove volatility: we get Y_t, which I would call the devolatized process, and this will follow a strict white noise, but with a covariance matrix equal to this correlation matrix. It's sort of easy to check mathematically from the definition that the Y_t are in fact independent vectors with correlation matrix P_C.
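A tiny numerical sketch of the sandwich decomposition, with made-up volatilities and correlations, just to show the algebra:

```r
## Sigma_t = Delta_t %*% P_t %*% Delta_t:
## Delta_t is diagonal with the conditional standard deviations,
## P_t is a conditional correlation matrix
vols  <- c(0.01, 0.02, 0.015)                   # hypothetical conditional vols
Delta <- diag(vols)
P <- matrix(c(1.0, 0.5, 0.3,
              0.5, 1.0, 0.4,
              0.3, 0.4, 1.0), nrow = 3)         # hypothetical correlations
Sigma <- Delta %*% P %*% Delta                  # conditional covariance matrix
all(eigen(Sigma, symmetric = TRUE)$values > 0)  # positive definite check

## Devolatizing: Y_t = Delta_t^{-1} X_t removes the volatility; in the CCC
## model the Y_t are strict white noise with (constant) covariance matrix P
```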
So this gives a clue about estimation. What if we take each series one at a time, estimate a GARCH like this — and after we've estimated the GARCH, which is equivalent to estimating the Delta matrix, we perform this operation, we devolatize and we get Y_t — and then from the Y_t we just estimate the correlation matrix? That's what I'm saying here: estimation can be accomplished in two stages. Fit univariate GARCH models to each component series and form some residuals, some devolatized data — of course, whether it really is devolatized and rendered independent depends on the quality of the GARCH models you've chosen, but GARCH(1,1) will probably do a reasonable job — and then all you have to do is estimate this correlation matrix, and that's easy to do. This model is very much the starting point from which to proceed to more complex models, and I think it's generally found from the data that it's not good enough in most cases. In some settings it might give an adequate performance, but it's generally considered that the constant conditional correlation is unrealistic: not only should the impact of news be felt on the volatilities, the standard deviations, but it should also affect the dynamic evolution of the correlations. So CCC is not really used so much; instead, the model which is used, and the one which I will fit, is DCC — dynamic conditional correlation. So go back to (76): what has changed? Everything is really as in (76), except that we no longer have a constant P_C, we have a P_t, and we have to decide how P_t depends on the past; the way we do it is through this strange equation here, (77). We have three bits — this is something analogous to first-order GARCH, but for the correlation matrix. There is a constant term, there is a P_C, and then we add on combinations of the devolatized process times its transpose at lagged values — in effect we are going to choose p = q = 1, so just think of one lagged value — and we also take one lagged value of P_t, so we take P_{t-1}. But by the time I add this and this and this, what I have in brackets here is probably not a correlation matrix — that's why I have this funny operator here — because when I add these three things up I get something which isn't a correlation matrix; it will be positive definite, that's OK, but it will be a covariance matrix, and so I have to apply this little operation that takes a proper covariance matrix and converts it into a correlation matrix. (Yes — these have nothing to do with the alphas and betas in the GARCH equations; it's perhaps unfortunate, but these are not the alphas and betas from the GARCH, these are new alphas and betas specifically for the correlation matrices.) So I think we've talked about everything here: this is a constant correlation matrix, if you like the long-run correlation matrix; Rho is an operator that just extracts the correlation matrix from a covariance matrix; Y_t is the devolatized process, obtained by dividing through by the volatilities, the conditional standard deviations; and it's going to be the case that these alphas and betas have to sum to something less than one to get a model that doesn't explode and makes some kind of sense. Now, here the Y_t needn't be — and in this model they're probably not going to be — strict white noise; there are going to be some dependencies in them. They are just devolatized, but they're not independent in this model; in the previous model, the CCC, they were. It's a little bit tricky. So CCC is a special case, when those alphas and betas are zero.
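To make the correlation recursion concrete, here is a minimal sketch of the first-order DCC updating equation with hypothetical alpha and beta (these are the correlation-dynamics parameters, not the GARCH ones), assuming Y is a matrix whose rows are the devolatized observations:

```r
## DCC(1,1) correlation recursion (sketch):
##   Q_t = (1 - a - b) * Pbar + a * y_{t-1} y_{t-1}' + b * Q_{t-1}
##   P_t = Rho(Q_t) = diag(Q_t)^{-1/2} Q_t diag(Q_t)^{-1/2}
rho_op <- function(Q) {                 # rescale a covariance to a correlation
  d <- 1 / sqrt(diag(Q))
  diag(d) %*% Q %*% diag(d)
}

dcc_correlations <- function(Y, a = 0.05, b = 0.90) {
  stopifnot(a + b < 1)                  # needed so the model doesn't explode
  Pbar <- cor(Y)                        # long-run correlation matrix
  Q <- Pbar                             # initialize the recursion
  P <- vector("list", nrow(Y))
  for (t in seq_len(nrow(Y))) {
    P[[t]] <- rho_op(Q)                 # conditional correlation for time t
    y <- Y[t, ]
    Q <- (1 - a - b) * Pbar + a * tcrossprod(y) + b * Q
  }
  P                                     # list of conditional correlation matrices
}
```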
By analogy with the univariate GARCH model you can sort of see what happens: in a univariate GARCH you have an updating equation of this form — the volatility equation can be written like this, with sigma_t squared given by a term involving the unconditional variance (it should be a square there), a term involving squares of X_t, and previous variances. So this is for GARCH itself, and if we compare that one and that one you get an idea of what is happening: in DCC you can think of P_C as representing the long-run correlation structure. So how do you estimate it — or how do most people estimate it, and how does Alexios Galanos estimate it in his R code? Well, you fit univariate GARCH models to each series, which gives you an estimate of Delta_t; you devolatize and you get the Y_t hats; you estimate the long-run correlation structure by applying some correlation estimator to the Y_t hats; and then, finally, in the third step you estimate the parameters governing the dynamics of the correlation. Of course, when you see these multi-step methods you ask: does this give good estimates, does it give consistent estimates, does it give efficient estimates? In the econometrics papers they have been through the asymptotics and decided that that is the case. We will just fit the first-order model, p = q = 1, and the final stage, stage three, will be done by a maximum likelihood method, or a conditional maximum likelihood method. I think this slide is not worth going through; instead, what I want to do is fit it and then come back to a few points.
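In practice the fitting is done with Alexios Galanos's rmgarch package; here is a minimal sketch of a first-order DCC specification and fit, where X is a placeholder for a D-column matrix of returns:

```r
## Sketch: DCC(1,1) with GARCH(1,1) margins and multivariate normal innovations
library(rugarch)
library(rmgarch)

uspec <- ugarchspec(
  variance.model = list(model = "sGARCH", garchOrder = c(1, 1)),
  mean.model     = list(armaOrder = c(0, 0), include.mean = FALSE),
  distribution.model = "norm")

mspec <- dccspec(uspec = multispec(replicate(ncol(X), uspec)),
                 dccOrder = c(1, 1), distribution = "mvnorm")

fit <- dccfit(mspec, data = X)   # stage-wise estimation as described above
rcor(fit)                        # fitted conditional correlation matrices
rcov(fit)                        # fitted conditional covariance matrices
```

Swapping distribution = "mvnorm" for "mvt" would give multivariate t innovations, mirroring the univariate normal-versus-Student comparison earlier.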
Info
Channel: QRM Tutorial
Length: 111min 0sec (6660 seconds)
Published: Sun Jan 21 2018