Become a generalized linear model ninja. Trade secrets they don't want you to know about

Video Statistics and Information

Captions
Hello and welcome back. Today, as always, we have an awesome guest, so please introduce yourself. "[unclear] Yes, a statistics expert." I don't know about you, but I'm excited to start talking about generalized linear models. Here are our objectives: understand the difference between the random, link, and systematic components of a model; understand the difference between transforming Y and transforming the expectation of Y; understand under what conditions you'd use a Poisson, negative binomial, et cetera; and understand how to choose between different families of distributions. So those are our learning objectives. "Right you are, good sir."

So what the blip is a GLM anyway? Well, GLMs are generalized linear models. In standard linear models, or general linear models (not generalized, but general linear models), we assume the data are normally distributed. If data are normally distributed, we know how likely it is to have, for example, a score 1.96 standard deviations from the mean: the probability of being 1.96 or more standard deviations away, in either direction, is equal to an alpha level of 0.05. So all statistical conclusions are based on how frequently scores of a particular z fall. Makes sense. But what if the standard normal distribution doesn't fit? Then the alpha level will not be 5% anymore.

Y'all want to see a picture? I know I do. Look at that, people. The red is a normal distribution, the black is a beta distribution under some set of parameters, and the blue line is the critical z of a normal distribution. Now notice that the area under the curve for the normal distribution, the red, is probably not going to be the same as the area under the curve for the black. So if that is 5% under the normal, it's probably not going to be 5% under the beta, and type I error probabilities are probably going to be messed up. But you know what? Confidence intervals are going to be messed up too, because the t distribution or z distribution or whatever plays into the confidence interval. Even your estimates are going to be off. Everything's going to be off. So if you're in a
situation like this, where the residuals are not normally distributed, well, you best be using something else, my friends. Take-home message: the probability associated with extreme scores on a normal will probably not be the same as for extreme scores on a beta distribution, or a Poisson, or a gamma, or a zero-inflated, et cetera. So if the residuals are not normally distributed, we want to use GLMs, or generalized linear models, instead.

Some examples here. Let's say we're predicting death from heart surgery. Is a binary variable expected to be normally distributed? I don't think so. What about predicting sexual assault? Do we expect assault to be normally distributed? Boy, I hope not, because most people will probably not do anything close to sexual assault, and then fewer and fewer people will commit extreme instances of sexual assault, as you see on the graph on the left there. Another example might be predicting the number of times someone seeks therapy: most people don't seek therapy, and the probability shrinks as the number of times increases. These are situations where we would use GLMs.

So how do you model that? There are three components to a generalized linear model. "First it's the link function, the second is, whatever that was, the distribution of the residuals, and the third part..." Right you are. So there is the systematic component, which is the part that we are all familiar with. This is the part of the model where the prediction equation goes. In the past we've seen equations like b0 plus b1 times depression plus b2 times anxiety. Well, guess what, folks: when you're using the GLM, it's exactly the same. Woo! It's going to be b0 plus your slopes times your variables. Same, same. Nothing complicated there. That's awesome.

What is different, however, is the link function. The link function is some fancy-pantsy mathematics that bends the line. On the right there you see several examples of ways that you can bend the line. The top left is y is equal to
x squared; top right is the square root of x plus 10 (I did x plus 10 just because you can't take the square root of a negative number, otherwise the mathematics cries and stuff); then we have the exponent, then the inverse function, then the logistic or logit function at the bottom left, and then the log function at the right. Often we use link functions to maintain the proper interval. For example, the logistic should not go higher than one or lower than zero, because your probability can't exceed one and it can't get below zero, so we use the function e to the b0 plus b1 times x, divided by one plus e to the b0 plus b1 times x. So that's the link function.

And then third, what was it again? How are the residuals distributed? Are the residuals distributed as a normal? Are they distributed as a beta, as a gamma, as a binomial? I know, I know, these are all scary terms and stuff. Exactly, like my buddy here just said: no need to be scared, folks.

In ordinary regression we've actually been using GLMs, generalized linear models, all along, because general linear models are a special case of generalized linear models. Our systematic component in the past has been, for example, b0 plus b1 times depression plus b2 times anxiety; our link function was the identity function, which just means you multiply everything by one, and if you multiply everything by one it just stays the same; and the random component was the normal distribution. So that's all there is to generalized linear models.

Here are some examples of different distributions. By the way, one thing to be clear on here: you don't look at the distribution and say, "Oh yes, that must be a Poisson distribution," or "That must be a gamma distribution." I'm not expecting you guys to do that, because you can make just about any distribution look like a normal distribution with the right parameters. You can make a Poisson distribution look normal, you can make a
binomial look normal. So I'm not asking you to look at a distribution and say, "Oh yes, I intuit the shape of the distribution," or the family of the distribution, whatever. You don't have to do that. These are just examples of how they might look. For example, the bottom right is a beta distribution under certain really weird parameters, but it's going to change shape if you change the parameters. Just like with the normal, you can change the parameters: you can change the mean and the variance to give it a different location or a different spread. One thing about the normal is that it is always symmetric, whereas the other ones may or may not be symmetric depending on what the parameters are.

So what is the difference between a link and a transformation? I know you're asking, folks. A link function is different from a transformation. Remember, we used to transform the dependent variable if it wasn't normal; we'd use the Box-Cox transformation or something like that. Now, a log transformation is not the same as a log link. What is the difference? If you do a log transformation, you are transforming Y itself. For example, you might say the log of Y, which is equivalent to saying the log of b0 plus b1 times x plus e, because remember, that is our prediction equation, and we add the e in there to represent the fact that that's what Y is equal to. That's what a transformation looks like. Whereas with a log link you don't transform the e: you say log of b0 plus b1 times x. A different way to put it is that you transform the expected value, and "expected value" is just a fancy way of saying the mean. So you are transforming only the systematic component, which is equal to the mean of Y. So is e included? If so, it's a transformation; if not, it's a link. Make sense?

More on link functions. Different distributions have different default link functions, and here on the right are the defaults for different distributions. So for a general
linear model (a regression or ANOVA or a t-test or whatever), the link function is the identity function, which just results in b0 plus b1 times x, and the assumed distribution is the normal. With logistic regression, the link function is called the logit; it's e to the blah divided by one plus e to the blah, and the distribution is the binomial. With a Poisson, the default link function is the log, which means... well, I'm talking about the exponent, the natural exponent, here, but it depends on whether you're talking about Y or the expected value of Y. Anyway, e to the b0 plus b1 times x, that's the link function, and the distribution that we assume is the Poisson. And then the gamma: its default link function is the inverse, which is 1 divided by b0 plus b1 times x, and the distribution associated with a gamma is, not surprisingly, a gamma. It's kind of odd that culturally things worked out this way: we call logistic regression "logistic regression", which is named after the link function, the logit function, and yet we name Poissons, gammas, and betas after the distribution, not the link function. Go figure.

So how do I know if the link is a good one? Well, for now just stick to the defaults. R chose those defaults because they're probably pretty good defaults in general. The type of analysis dictates the family, so you generally don't have to choose what the link function is going to be. Instead you choose the analysis and the family, and the family chooses the link for you. But I would also look at the visuals: if the fitted line doesn't pass through the center of the data, then maybe you should try experimenting with different link functions, or maybe even different families.

So again, like I said, you don't have to worry about choosing the link function for the most part. Instead you have to worry about choosing the family. So how do you choose a family? And by the way, when I say
family, I'm talking about the distribution of the residuals; for some reason statisticians call it the family. So how do you choose a family, or how do you choose a distribution for the residuals? Let's just go through these one by one.

With a binomial distribution, we call this the logistic. It is great for binary outcomes, and the link here is the logit. We are very familiar with logistic regression because we covered it in detail last week, so I'm not going to go into much more detail on that.

Now a Poisson distribution (or however you pronounce it; I like saying "Poisson", it just sounds cooler than the alternatives, which sound like I have some sort of speech impediment). The Poisson distribution is great for skewed discrete distributions, and the default link is the log. We typically use this when we have count data: for example, the number of times you visited the doctor in a year, the number of times you have been to Disneyland, the number of times you have attempted suicide, the number of times blah blah blah. With the Poisson distribution we make a critical assumption: we assume that the mean of the distribution is equal to the variance of the distribution. On the next slide we'll talk about why that is an important assumption and how we can generalize it. On the right there is a graphical depiction of the criminal data set that I simulated in R. It shows a relationship between socioeconomic status and convictions, and that is modeled with Poisson regression.

That brings us to the negative binomial. Doesn't that just sound cool? "Negative binomial." You can impress your friends when you say, "I fit a negative binomial distribution with a log link in a generalized linear model, and I found that it fit far better than a Poisson distribution when I used the AIC, BIC, and Bayes factor calculations in doing my model comparisons." Wow, I just sounded
smart. Okay, so the negative binomial is also good for discrete distributions. What I mean by discrete is that you have count data, you have integers: ones, twos, threes, fours, that sort of thing. You don't have 1.2 or 1.758432. If you have any sort of data like that, the logistic and the negative binomial probably aren't going to work. The default link for the negative binomial is the log, and it has two parameters: s is the number of successes and p is the probability of success. I'm not going to go into details on those, but basically, as you change those two parameters the distribution may look more and more normal or more and more skewed, depending on how those things are put into the equation. It is often a good alternative to the Poisson, because here's the important thing: a negative binomial does not assume that the mean equals the variance.

Wow, this is sounding like a whole lot of technical stuff right now, so let me just pause and take a deep breath. We're good. So far we've talked about the Poisson and the negative binomial. If you have count data, or if you have discrete values (one, two, three, four, five), I would probably model it with both the Poisson and the negative binomial, and then you can do a model comparison and see which model is better. Okay, we're good. Don't need to worry about it.

That brings us to the gamma distribution. What happens if you have skewed non-discrete data, skewed continuous data? What do you do then? Well, then you can use a gamma distribution. Gamma is great for skewed continuous distributions. The default link is one over x, at least it is in R; in SPSS I think it might be the log of x or something like that, I don't remember. So we are using a link function that is an inverse, and what happens is that when you compute the coefficients they will actually be reversed, because it is inverted. So they might all come out negative when in reality you have a
positive relationship between the two. For example, in the image on the right we have a negative association between empathy and aggression: as people's empathy scores go up, their aggression goes down. But if you were to use a gamma distribution with an inverse link, then you would find that the coefficient would actually be positive, so you have to reverse that in your mind. Also, by the way, the gamma distribution by definition is strictly positive; gammas are always positive. So if you have an outcome variable that is negative, R will bark at you and it will refuse. It will cross its arms and say, "I'm not doing it." So sometimes, let's say the minimum score is negative 10, you might have to add 10 and a half or 11 or something like that to make the scores positive. That's just the nature of the gamma.

Ordered logistic. Ordered logistic is a nice alternative to the gamma. Well, ish. Ordered logistic is great for discrete numerical outcomes, kind of like the Poisson or the negative binomial, where higher values mean there is more of that outcome. So it's ordinal. You know, back in intro stats we talked about nominal, ordinal, interval, and ratio scales; well, ordered logistic is for the second category. If you have Likert scale data or something like that, or even most measures in psychology, they would probably do well to be modeled by an ordered logistic. If you suspect that the interval between points is not equal, like the difference between 1 and 2 is probably not the same as between 2 and 3, ordered logistic is perfect for them. That's what it was designed for. So I often default to an ordered logistic if I have skewed data.

Another thing that happens is that technically, when you're using general linear models, you should not have a minimum value and a maximum value. Theoretically the outcome could take any value down to negative infinity and any value up to positive infinity. But sometimes the scales that we use cut
it off. You know, if you've got a Likert scale from zero to ten, you can't get lower than zero and you can't get higher than ten. Well, that can actually mess things up in general linear models, so ordered logistic would be great for a situation like that. Ordered logistic is an extremely flexible GLM; it doesn't really make any assumptions. Basically, it is a logistic regression on steroids. What it's doing in the background is fitting a bunch of logistic regressions, one for each level. So let's say you've got a Likert scale from zero to ten. It's going to fit a logistic regression and say, "Is your score zero or not?" And then once it fits that logistic regression, it's going to say, "All right, are you a two or not?" Okay, technically I think it's "a two or higher," or "a one or higher," and then it says, "Are you a 2 or higher? Are you a 3 or higher?" And it just does that a bunch of times until it covers all its bases.

On the right here is an example of looking at the relationship between socioeconomic status and aggression for various levels of empathy and depression, and I'm just overlaying the predictions of the GLM. The GLM in this case was a gamma, and I should probably change that in the [unclear] function to be a little more specific than just saying that it was a generalized linear model. Anyway, the red is the gamma and the blue is the ordered logistic. Oftentimes they will be similar, but like I said, the ordered logistic is more flexible.

So what about multinomial logistic? Well, I don't have a plot of that on the right there, so that's why that guy is doing this. Anyway, multinomial logistic is very similar to the ordinal logistic, or what do they call it, ordered logistic; ordinal, ordered, whatever you want to call it. It is also used for discrete numeric outcomes, but the categories are not ordered. With ordered logistic you are on an ordinal scale, you know there are innate quantity differences between the values of the scale, whereas with
the multinomial it's not. So you might use multinomial to predict who's a Democrat, who's a Republican, who's a Libertarian, who's an independent, that sort of thing. It is also a very flexible GLM, and like the ordered logistic, it too fits a bunch of logistic regressions at each level: are you a zero or not, are you a one or not, et cetera.

I'm going to talk about one more, and again I don't have a graphic for this one either. This is the zero-inflated model. With zero-inflated models ("ZF" should be "ZI", what am I doing?), with ZI models we assume two processes generated the data: one process that determines whether the outcome actually occurred, and one that determines the degree to which it occurred. I actually ran into this a year and a half ago. I had a student who was looking at non-suicidal self-injury, so people who hurt themselves not with the intention of suicide but who cut themselves or whatever. It seemed to me this was a perfect situation for zero inflation, because there is a very big difference between those who engage in NSSI and those who do not, but this student also wanted to predict the degree to which they did it. At the time I didn't know about zero-inflated models, and I thought, "It'd be so cool if there were a model that could do a logistic to figure out who does and who does not, and then fit something like a gamma to figure out how much they do." Well, it turns out that's exactly what a zero-inflated model is. A zero-inflated model breaks it up into two components, those who do versus those who do not, and then, of those who do, it tries to model how much they do. So the logistic predicts the presence, and then the other model, whether it be a gamma, a Poisson, a negative binomial, whatever, predicts the degree.

So in summary, these are our generalized linear models. If you've got a binary outcome, for example dead versus not dead, we're going to use a binomial family (assume binomial residuals) with a default link of the logit. If we've
got a skewed continuous outcome variable, we're going to use a gamma with an inverse link. If we've got count data, for example the number of incidents of rape, we're going to use a Poisson with a log link, or we might use a negative binomial when the mean is not equal to the variance (like I said, you should probably fit both), and we use the default link of the log. Then, if we've got a discrete outcome variable with any shape of distribution and the outcome is ordered, we use an ordinal logistic, or ordered logistic, with a binomial link. If we have a discrete outcome variable with any shape of distribution and the outcomes are not ordered, then we can use the binomial family again with a multinomial logistic. And finally, if we want to predict the presence of something and the degree, and we suspect that those are two different processes, then we're going to use a zero-inflated model. Oh, "default link", that's kind of a misnomer there, isn't it? I guess the far right should say what the name of it is. Anyway, my table is messed up; that's okay. So if we're predicting the presence and the degree with the zero-inflated model, we're going to use a logistic and one of the other ones.

So let's just step back for a minute. I know it's overwhelming; GLMs can be overwhelming. But let me talk big picture for a minute. Two years ago-ish, when I was designing this course (it's technically called multivariate statistics), I had some moments of self-reflection. I said, "All right, what do clinical psychologists, who are the people I teach, need to know the most?" And then I said, "Okay, let me translate that: what are they going to be most likely to use?" And I started thinking, "What do I see in the literature?" Well, I see a whole lot of multiple regression, so that's obviously got to go in there. I see a whole lot of mixed models; those have got to go in there. I see a lot of situations where people should use robust stuff,
so robustness should be in there somewhere. And honestly, I almost never see GLMs. If I see a GLM it's going to be a Poisson, but I never see anything else, and that's actually a huge problem, because psychology frequently studies variables that are not normally distributed. So I chose to talk about GLMs not because they're common but because they need to be common, because these situations where you need a generalized linear model come up all the time. I think most multivariate statistics classes cover canonical correlation, they cover, what's that called, MANOVA, MANCOVA. I never see these in the literature and I think they're totally useless procedures, so I'm not going to teach them. Instead I'm going to teach GLMs, because those are actually useful and actually needed. Okay, am I done with my soapbox moment now? Yes, I am.

So in summary: when data are not normal, you may choose to resort to GLMs, and the need for them may be more common than people's actual use of them. GLMs have three components. There is the systematic component, which is very regression-like, and just as when we used the general linear model in the past, we can have both quantitative predictors and qualitative predictors in there; the computer doesn't care. There is the link function, which is the function that curves the line. And there is the random component, where we specify what distribution the residuals take.

In the next video I am going to talk about evaluating the fit of the model. How do you determine whether the GLM that you have chosen is good? We can do that visually or statistically, and I'm going to show you how to compare two models both visually and statistically. I will mention comparing families, or comparing the same family with different predictors. I'm not sure that I'll actually do that in the video, but it's easy enough that y'all can figure it out on your own. And then I will show you how to do GLMs in R. So
with that, here are our learning objectives, and thank you very much. I'd like to thank our special guest. It has been an absolute pleasure to have you here. Thank you so much for coming and sharing your wisdom and your insight; it's really been such a pleasure. We'll see you next time.
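
As a rough companion to the captions above, here is a minimal Python sketch (standard library only) of two ideas from the video: the inverse forms of the default link functions, which map the prediction equation b0 plus b1 times x back to the mean of Y, and the gap between a log transformation (averaging log Y) and a log link (taking the log of the mean of Y). The coefficient values, the function names, and the simulated lognormal data are all assumptions made for illustration. None of them come from the video, which works in R.

```python
import math
import random

# Hypothetical coefficients for the systematic component b0 + b1 * x.
B0, B1 = -1.0, 0.5


def linear_predictor(x):
    """Systematic component: the same prediction equation as in regression."""
    return B0 + B1 * x


# Inverse link functions: each maps the linear predictor back to the mean of Y.
def inv_identity(eta):
    return eta  # normal family: the mean is the linear predictor itself


def inv_logit(eta):
    # binomial family: e^eta / (1 + e^eta), always between 0 and 1
    return math.exp(eta) / (1.0 + math.exp(eta))


def inv_log(eta):
    return math.exp(eta)  # Poisson family: the mean is always positive


def inv_inverse(eta):
    return 1.0 / eta  # gamma family default; undefined when eta == 0


for x in (-4, 0, 4):
    eta = linear_predictor(x)
    print(x, inv_identity(eta), inv_logit(eta), inv_log(eta), inv_inverse(eta))

# Log transformation vs. log link on simulated, skewed, strictly positive data.
# A lognormal sample is an assumption chosen only because it is easy to simulate.
random.seed(1)
y = [random.lognormvariate(0.0, 1.0) for _ in range(100_000)]

mean_of_log_y = sum(math.log(v) for v in y) / len(y)  # transform Y, then average
log_of_mean_y = math.log(sum(y) / len(y))             # average Y, then apply the link

# In theory these are 0 and 0.5 for this distribution, so they do not coincide.
print(mean_of_log_y, log_of_mean_y)
```

The two printed quantities at the end differ because the log is applied at different points: a transformation models the mean of log Y, while a link models the log of the mean of Y.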
Info
Channel: Quant Psych
Views: 1,167
Rating: 5 out of 5
Keywords: Statistics, Psychology, NHST, Null Hypothesis Significance Testing, Philosophy of Science, poisson regression, negative binomial, logistic regression, ordered logistic, gamma regression
Id: 1IbQaBRYt0s
Length: 28min 33sec (1713 seconds)
Published: Fri Nov 30 2018