Econometrics - Marginal Effects for Probit and Logit (and Marginal Effects in R)

Captions

Hello. In this video I'm going to talk about how we can interpret probit and logit models using marginal effects, and also how to do that in R.

If you remember, in the last video I showed you this table of coefficients from a linear probability model, a logit model, and a probit model, and we saw that it was a little difficult to interpret what was going on with the logit and probit coefficients. Now, you can interpret these coefficients directly if you really want to. There are odds ratios involved, a lot of stuff. But it's a lot easier if we can put things in the terms the linear probability model uses: what is the effect of a one-unit increase in x on the probability that the outcome is equal to one as opposed to zero?

We can do that with the concept of marginal effects. Really, what we're trying to get here is the slope of this prediction line. That's it. The slope of that prediction is what the marginal effect is, and mathematically you can think of it as a derivative: if we took the derivative of our probit or logit function, it would give us the marginal effect.

Now, the difficulty is that the math here gets a little bit tricky, and that's true for both probit and logit. I can tell you that for logit, for example, the derivative of the logistic function is the coefficient on your variable, times the value of the logistic, times one minus the value of the logistic. That's the marginal effect, but you probably don't want to memorize that. The key thing to look at here is, first of all, as we mentioned last time, that the marginal effect is going to differ depending on what the index function as a whole is. It's not just the coefficient by itself; it's the coefficient multiplied by this function of the index. So the slope on your x variable is going to be different depending on what the prediction for you already is, what your
index function already is. And this makes sense, right? We would expect that if you're somebody who's down here, then your slope should be pretty flat, and if you're somebody who's up here, then your slope should be steeper. What we can take away from this is that each individual observation in your data is going to have a different marginal effect based on what their index value is.

This is not exactly the same as what we talked about in the interactions video, where I said that the effect of a variable could be different for different groups or things like that. This is not so theoretically fundamental. What this is saying is that the marginal effect is going to be different depending on what your index value already is, because if you're down here you don't have much room to move, and if you're up here you have a lot more freedom. So it's not that there's some fundamental reason why it might be differently effective for you or me. It's just where we started, and that tells us how much more room we have to move.

So we have a different marginal effect for each individual observation in the data, depending on what their index value already is. Are we predicting that they're in the middle, in which case they're going to get a steeper marginal effect? Or are we predicting that they're near the edge, where it's going to be shallower?

Given that each individual has their own marginal effect, how can we say there's such a thing as "the" marginal effect, that if we increase x by one then this is what's going to happen generally? That's what we would want, and it's going to be the simplest way of interpreting what's going on here. We could, of course, just calculate the marginal effect for each person and show you the distribution of those marginal effects. That is something you can absolutely do, and it is the cleanest and most thorough way of showing a marginal effect. But that's often not what we want. Often we
want to have a single number that represents the marginal effect for the sample as a whole, and then we can also put, for example, a standard error on that and see if it's statistically significant or not.

Now, there are a couple of different ways we can take that distribution of marginal effects over the whole sample and boil it down to one number. There are three main ones.

One is to get the marginal effect of a representative. That is, you pick a particular set of values for an individual and say, I want to know what the marginal effect is for the person who has those values. That actual person may or may not be in your data, but you can ask something like, okay, I want to get the marginal effect for a woman who is 22 years old and has red hair and so on. I can plug in all those values, see what the index function is, and then use that individual index value to calculate an individual marginal effect. That would be the marginal effect of a representative.

I can also get what's called the average marginal effect. That is calculating the marginal effect for each individual in your sample and then just averaging them. That's your average marginal effect.

You can also get what's called the marginal effect at the mean. What that is: you first calculate the mean of every single one of your independent variables, and then you use that to get the marginal effect of a representative. So in that case your representative is the person with the average values of each of your independent variables. This one used to be a lot more common. It is less common now; I'd say the default expectation now is probably the average marginal effect. The marginal effect at the mean is a bit easier to calculate, because you only need to calculate the index and then the link function once, but it also means that you're getting the marginal effect for a person who doesn't actually exist. I mentioned trying to get
the marginal effect for somebody with blonde hair. Well, if 25 percent of your sample has blonde hair, the marginal effect at the mean will calculate the marginal effect for a person who is 25 percent blonde. That person doesn't exist, so it's kind of weird. It also ignores how the different variables might be related to each other. At the mean you might predict somebody's going to be right in the middle and get a very large marginal effect, but if everybody in your sample is actually way over here or way over here and nobody's in the middle, well, the real marginal effect is almost zero for everybody, and yet we'd be giving them a large effect. So probably the standard of what you want is the average marginal effect: get the marginal effect for each individual in your sample and then average those out.

So let's see how we can do that in R. I'm going to load up some data. I've got the wooldridge library, which I'm only using to get this data set, the card data set. If we look down here, what have we got? We've got a bunch of indicators here. What we're going to be predicting is whether you enroll in college in 1976, based on things like how close you live to a college, your parents' education, and I think I'm going to throw in your IQ score as well.

First, let's just show how to do probit, logit, and linear probability models in R. Linear probability models are easy. We've done this before; it's just regular regression. So our lpm is going to be the regular linear model. We're going to regress enrollment on being near a two-year college, being near a four-year college, your father's education, your mother's education, and also your IQ score, and that's from the card data. And there we have it: a linear probability model right there.

Okay, now let's do it with probit or logit. This is actually fairly easy to do, because all we need to do is take this exact same code and, as
I mentioned, we're doing a generalized linear model, so let's just do that. Our probit is going to use glm, the generalized linear model function, and we're going to use the exact same formula and the exact same data call. The only difference is that we need to tell it what kind of link function we're going to use. So we set the family argument, which tells it what family of link functions we're using. We're going to use the binomial family, which is for binary data, and we're going to say our link is the probit link. That will run probit for us. Logit is exactly the same; instead of the probit link we just use a different link function, the logit link, and we can run that.

Let's look at all of these regressions together. Let's load up the jtools package and do export_summs of lpm, probit, and logit, and see what that gets us. All right, not a lot of significance here, but living near a four-year college certainly seems to have an effect on whether you enroll. The coefficients, of course, are not easy to interpret, because they're just probit and logit coefficients. But we get a linear probability model effect of about a four-percentage-point increase in the probability of enrolling, with one significance star, which is at the five percent level.

Okay, so how can we get marginal effects from this? There are actually several different packages. As with everything in R, there's a bunch of different ways to do it. Probably the most common way to get marginal effects is with the mfx package, which contains the logitmfx function and also the probitmfx function for getting marginal effects for those models in particular. I prefer the margins package, however, because it does a couple of things. For one thing, it gives you average marginal effects by default. That's nice. For another thing, it gives you the marginal effects for all the variables by default. Also nice. For another thing, it works with both probit and logit
with the same function, so instead of logitmfx and probitmfx it's just margins as the function. And lastly, in my opinion, it properly deals with things like interaction terms. If we have an interaction term in a probit or logit model and we want to get the marginal effect, what we're really saying is: okay, great, I know that the effect is different for these different groups because I've included an interaction, but I want to know the average marginal effect. So what is the effect of this variable for you? That will already incorporate the interaction term. So any interaction terms I include in my probit or logit model are going to disappear when I use the margins function, which probably makes a lot more sense for calculating the average marginal effect.

Now, that said, there are some ups and downs. If you use logitmfx or probitmfx, you will indeed see the marginal effect of the interaction term, which might be good if what you're interested in is the significance of your interaction term. You do want to be careful with using and interpreting interaction terms in probit and logit models, though, because they don't always mean what you think they mean. See the slides that talk about this, or look at Ai and Norton (2003). So you might want to use mfx for that reason, if you're interested in the interaction term.

The other thing about margins is that, as of the filming of this video, it doesn't work with export_summs unless you install the development version of the broom package. If you're watching this in, let's say, 2021, you probably don't need to bother with this step. For right now, you need to do remotes::install_github("tidymodels/broom"). I'm not going to run that because I've already done it, but it will install the development version of broom, which is necessary to use the margins output in export_summs. You might also need to install the
remotes package before you do that. Okay, but we've already done that, so let's do margins. We've loaded up the margins package, and I'm going to make my probit margins; all I've got to do is margins of probit. Done. And for my logit margins, I'm going to do margins of logit. There we go.

Of course, there are a number of different options there. If we look at the help for margins, we can specify margins at different values of the variables; we can choose all sorts of stuff. You can choose whether you want the marginal effect on the prediction, the response, or on the value of the link function, so, do you want what's inside there? Generally you want the response if you're going for marginal effects.

We can see the output here: export_summs of lpm, probit margins, and logit margins. Look at that. Now, you do get this little error down here, but don't worry about it. We can see here that the results are now all very similar. I promise I did not just put the OLS result in there multiple times. This is the marginal effect; it just so happens that here OLS, the linear probability model, seems to do a pretty good job of mimicking logit and probit. Anyway, we get a bump of 0.04, that is, about four percentage points, in our enrollment rates for living near a four-year college.

The other thing to point out with margins is that we can get the marginal effect for a representative. We can choose a marginal effect either by setting all of the covariates to particular values or by setting just one or two of them. What that does is say, okay, what would the marginal effect be if everything else was the same but everybody had this value of this variable? That's going to determine where we are in the index, and it's going to give us a different marginal effect. Now, to be clear, this is not getting the effect among that group, as we might do if we were using some sort of interaction term. It's just saying, let's set this value and see how that affects the index, and then, just based on the changes in the
index, what are the marginal effects there? We can do that with the at option in margins. So let's say logit at; that's going to be margins of logit, and we're going to set at, and here we give it a list of options. Let's say that your IQ is 110. If I do this, I now have the marginal effect as though everybody had an IQ of 110, and we can see the result there. That was not enough of a change to really change the result, but that is how we could set different variables to different values and then calculate the marginal effects from there.

All right, that is it. Thank you very much.
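
For readers following along without the screen, here is a rough R sketch of the steps described in the captions. It is not the exact code from the video: the object names are my own, the column names (enroll, nearc2, nearc4, fatheduc, motheduc, IQ) are what I understand the card data in the wooldridge package to contain, and the "by hand" average marginal effect and marginal effect at the mean are added only to illustrate the formulas described verbally above (coefficient times the derivative of the logistic or normal CDF at each index value). It assumes the wooldridge and margins packages are installed.

```r
# Sketch of the workflow described in the captions (names are illustrative).
library(wooldridge)  # only used for the card data set
library(margins)     # average marginal effects for glm fits

data("card", package = "wooldridge")

# Linear probability model: ordinary regression on a 0/1 outcome.
lpm <- lm(enroll ~ nearc2 + nearc4 + fatheduc + motheduc + IQ, data = card)

# Probit and logit: same formula and data, only the link changes.
probit <- glm(enroll ~ nearc2 + nearc4 + fatheduc + motheduc + IQ,
              data = card, family = binomial(link = "probit"))
logit  <- glm(enroll ~ nearc2 + nearc4 + fatheduc + motheduc + IQ,
              data = card, family = binomial(link = "logit"))

# Index value x'b for every observation actually used in each fit.
index_logit  <- predict(logit,  type = "link")
index_probit <- predict(probit, type = "link")
b_logit  <- coef(logit)[-1]   # slopes only, intercept dropped
b_probit <- coef(probit)[-1]

# Average marginal effect by hand: coefficient times the average slope of
# the response curve. dlogis(z) equals plogis(z) * (1 - plogis(z));
# dnorm(z) plays the same role for probit.
ame_logit  <- b_logit  * mean(dlogis(index_logit))
ame_probit <- b_probit * mean(dnorm(index_probit))

# Marginal effect at the mean by hand: one slope, evaluated at the index of
# a hypothetical observation with every regressor at its sample mean.
x_bar     <- colMeans(model.matrix(logit))
mem_logit <- b_logit * dlogis(sum(x_bar * coef(logit)))

# margins() reports average marginal effects by default, for either link.
probit_margins <- margins(probit)
logit_margins  <- margins(logit)
print(summary(logit_margins))

# Marginal effects as though everyone had an IQ of 110.
logit_at <- margins(logit, at = list(IQ = 110))
print(summary(logit_at))

# Side-by-side comparison of the slopes from each approach.
print(rbind(LPM = coef(lpm)[-1], AME_probit = ame_probit,
            AME_logit = ame_logit, MEM_logit = mem_logit))
```

If the hand-computed logit row and the margins summary agree closely, that is a reasonable check that margins is averaging per-observation slopes as the video describes. The table step with export_summs from jtools is left out here to keep the dependencies short.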

Info

Channel: Nick Huntington-Klein

Views: 11,138

Rating: 4.9813952 out of 5

Id: lJMO5kqWwIo

Length: 13min 46sec (826 seconds)

Published: Fri Sep 04 2020