Introduction to Lavaan

Captions
Why use lavaan? Why would you want to use lavaan if you're familiar with the other tools? The mental model of it is very similar to Amos, and I believe to things like Mplus as well; the syntax is similar. I think the mental model of how the syntax is written is quite intuitive if you're familiar with Amos and tools like that. So in that sense, compared to some of the other R packages, it's a little bit more intuitive. It also has a lot of features, so you don't hit the wall of "I want to do this and I can't do it" as much with lavaan. You can do special things for categorical variables if you have Likert-type items, you can do growth curve models, and there's a whole range of estimators and so on. So there are lots of features beyond the basic stuff, and if you invest in it, it's worthwhile. Mind you, if all you have to do is fit one really basic confirmatory factor analysis model with 20 items, you could use whatever tool you already know, and Amos would probably be quicker for a lot of people.

Where lavaan gets really great is, first, reproducible research: you've got a script that you can point to and say "this is what I did". And it's awesome when you have a series of models, or models with lots of items; you then start to see the benefits of managing that complexity with syntax rather than drawing. If you've ever tried to draw an Amos diagram with 100 or 200 variables, with a hierarchical structure of factors, it's a disaster, and then you've got ten different models on top of that. In those slightly more complex cases, lavaan becomes really lovely. You can also do some nice things where you pull out just the fit statistics you want and quickly assemble a table of them, which can be quite tiresome in Amos. A further thing that can be really nice is when you refine a model and want some automated rules, say "if the modification index is above this threshold, add this correlated residual"; there are ways of automating that which aren't really available in Amos. So it's a bit more of an advanced tool, but it's fairly intuitive if you're used to R and these kinds of tools.

So, hopefully you've got R and RStudio installed, and you've installed the lavaan package by running install.packages("lavaan"). Today I also use the ProjectTemplate workflow; it's just how I like to do things, and pretty much everything I'm doing doesn't rely on it, but it's how I organize my files and folders and I think it's quite nice, so look into it if that interests you. I actually gave a talk on it just last night.

In terms of getting started with lavaan, the website is really nice. Yves Rosseel, a statistician and methodologist in Belgium, has done a really good job of setting up tutorials and making the package fairly accessible. There are lots of tutorials that will take you through different kinds of models, CFA, SEM, and so on, so it's really about going through the tutorials and trying things out.
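As a minimal sketch of that setup (the package names are as mentioned above; this is just the standard install-and-load pattern):

    # One-time installation of lavaan (and optionally ProjectTemplate)
    install.packages("lavaan")
    install.packages("ProjectTemplate")  # optional; only for the project workflow used here

    # Load lavaan for the current session
    library(lavaan)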
The R help files for lavaan are also quite good. With lavaan loaded, the auto-completion for the main kinds of models is fairly intuitive, so you can do things like ?cfa and learn about the arguments for confirmatory factor analysis, and so on. I'd just point out three help files that probably don't jump out at you immediately. We'll go into this in more detail, but if you want to get information from your model, like factor correlations, residuals, parameters, anything you want to know about your model, there's the inspect method, and its help file lists the whole range of things you can pull out. Likewise, if you want to customize how your model is run, a lot of the options are specified and explained under lavOptions. And finally, there are lots of what R calls methods that you can apply to lavaan objects to extract various information. If you're familiar with linear regression in R, you'll know that you fit the model and then you can extract summary statistics and other information from it, and there are standard methods here too, like coef to get the coefficients, fitted to get the fitted values, and so on.

And if you use lavaan in a paper, make sure you cite it; Yves Rosseel has obviously put in a lot of work and makes it available for free, so the way of saying thank you is to cite it. I've also got some other CFA videos and lecture materials on my website, both on CFA in general and a second exercise a bit similar to this, and there's also lots of material if you search YouTube.

Okay, so that's all very abstract. What I wanted to do now is give a very basic tutorial on the different functions in lavaan and then move to a more complex example and go through an actual exercise. The way this ProjectTemplate workflow works is that we first run library(ProjectTemplate) and load.project(), and I think I've already run it, so it has loaded a file called ccases from the data directory. ccases is just a data set, a personality data set, with twenty-five Big Five items, hopefully familiar: agreeableness, conscientiousness, extraversion, neuroticism, and openness.
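A sketch of that ProjectTemplate step (assuming the project's data directory contains the ccases data set; the object name follows the video's description):

    library(ProjectTemplate)
    load.project()  # runs the project's loading/munging scripts; here it creates `ccases`

    # A quick look at the data: 25 Big Five items, one row per respondent
    dim(ccases)
    head(ccases, 2)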
And in this metadata file we've got all the item information, so the first agreeableness item reads "am indifferent to the feelings of others"; that's a negatively worded agreeableness item. It's just a standard measure of the Big Five, and we've got about 12,000 people who did this test. If you look at the first couple of rows with head, one row is one person's set of responses to the survey, and I guess we would assume that if we factor analyzed this it would form five factors, and so on.

For this first example I thought it might be easier to focus on just three of the factors, and to look at the first three items of each: a1, a2, a3, then c1, c2, c3, then e1, e2, e3, just to keep it really concise. So how would we fit a measurement model to this data? In structural equation modelling terms, how would we say there are three latent factors explaining these nine items, and that we want the factors to be correlated?

The first step is to load the lavaan library. Because I use ProjectTemplate, I would usually put this in the project configuration, but if you don't use ProjectTemplate you would just run library(lavaan), and that makes sure you've got all the functions of lavaan available to you. And if you weren't using ProjectTemplate, you could load some SPSS data using the foreign package and the function read.spss, whose default arguments routinely do a pretty good job of loading an SPSS file into your R session. The next function just makes the variable names lowercase. I find it much easier if I don't have to think about whether a name is a capital A or a lowercase a; SPSS is not case sensitive, so you'll often find variable names that are a mix of capitals and lowercase, and if you don't want to have to think about that, it's nice just to make all the variable names lowercase.

Okay, so the model syntax of lavaan is like this. If you're doing structural equation modelling and you want to predict one thing from another, you use the tilde (~), the little symbol probably near the Tab key on your keyboard. If you want to say that items load on a factor, you use equals-tilde (=~). If you want to correlate factors, or correlate items' residuals, you use the double tilde (~~). And occasionally you use ~ 1 for intercepts.

So what does that look like? Here's a really basic model, what you could call a measurement model: three latent factors, each with three items. What this says is that we've got a latent variable called agreeableness, and it is represented by three items, a1, a2, and a3; a latent variable called conscientiousness that has three indicators, c1, c2, and c3; and a latent variable called extraversion that is measured by the items e1, e2, and e3. If you're familiar with Amos, you'll know that latent variables need to have a metric, something to define the scale that they are on, and the two ways to do that are either to constrain their variance to something, usually one, or, perhaps slightly more commonly, to constrain the loading of the first item to one.
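A sketch of that model syntax (variable names as in the data set described above):

    # Three correlated latent factors, three indicators each.
    # `=~` means "is measured by"; cfa() fixes the first loading per factor
    # to 1 by default, and leaves the factor covariances free.
    model <- "
      agreeableness     =~ a1 + a2 + a3
      conscientiousness =~ c1 + c2 + c3
      extraversion      =~ e1 + e2 + e3
    "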
Fixing the first loading to one doesn't affect the standardized loadings, but it does affect the unstandardized ones. By default, lavaan assumes the first item has a loading of one, and in this example that's actually a problem: you need to make sure the first item is positively worded, otherwise you won't get agreeableness, you'll get disagreeableness. Technically, if I remember rightly, a1 was a negative item, "am indifferent to the feelings of others", so it's actually a reversed item. If I was going to be technical I should probably call the factor disagreeableness, or better yet rearrange the items to make sure the first one in the list is positively worded. So let me do that and move a positively worded item across. And c1, "am exacting in my work", that sounds like conscientiousness, so that's fine. E1 is "don't talk a lot", which is negative extraversion, whereas "know how to captivate people" is positive extraversion, so let me move that one across as well. It's not going to affect the model fit or the correlations; it would just mean that the factor is no longer reversed.

Okay, so what we have here is the model description, and it's encased in quotation marks. If I run this, it assigns that model string to a variable called model, and when it prints you may see these \n characters; that's a convention in R for new lines, so \n basically means Enter. By default it won't print nicely, but if you put it in cat it will print like you would expect to see it: cat replaces the \n escapes with actual new lines. So if you want to check that your model description looks right, you can cat the model and it will print out properly.

Okay, so how do we fit a confirmatory factor analysis model in lavaan, and what does that look like? Lavaan has a number of model-fitting functions, and probably the two main ones are cfa and sem. If what you're doing is confirmatory factor analysis, the main function is cfa; if you're doing structural equation modelling, there's the function called sem. The two main arguments are the model syntax and your data, which makes sense: you've got a model syntax, and you've got your data, a data frame with lots of rows and columns. The variable names in the model should correspond to variable names in ccases, so we should have variables called a1, a2, a3, and so on; if you used a name in the model that doesn't exist in the data, you'd get an error. Just to reinforce that, there they are: a1, a2, a3, and the rest.

You can just run that, but in typical R style, if you run a model without assigning it you don't get much output. So what we do is assign the fitted model to a variable. I've called the variable fit, but you can call it anything you like. Now we have the object, there are lots of things in fit, and there are lots of things we can get out of it via what R calls methods, which apply to a lot of different R model-fitting procedures. summary is a common one, so if you do summary(fit) we get a lot of the information we might want. At the top we can see the number of observations in our data file, and we can see whether the model estimated correctly, whether it converged normally.
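Putting those pieces together (a sketch; the SPSS-loading step is only needed if you aren't using ProjectTemplate, and the file name there is hypothetical):

    # If loading from SPSS rather than via ProjectTemplate (file name hypothetical):
    # library(foreign)
    # ccases <- read.spss("data/personality.sav", to.data.frame = TRUE)
    # names(ccases) <- tolower(names(ccases))

    cat(model)  # check that the model string prints the way you expect

    fit <- cfa(model, data = ccases)  # fit the CFA
    summary(fit)                      # n, convergence, chi-square, estimates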
If you have a poorly specified model you can get convergence issues, because it's an iterative fitting procedure that has to converge on an optimal solution. You then get what you may have seen referred to as the chi-square value; lavaan calls it the minimum function test statistic, but that's basically the chi-square. It's 172 here, so if you're used to reporting the chi-square for your confirmatory factor analysis, you'd say chi-square equals 172 with 24 degrees of freedom, and it's significant. Here significance technically means that the model doesn't perfectly represent the data, but to be honest the p-value is almost completely irrelevant in structural equation modelling, because with a reasonable sample size you always get discrepancies between your model and the data. The question is whether it's a reasonably good representation, and for that we usually refer to fit statistics, or we might compare multiple models and ask how much better one model is compared to some plausible alternative.

Then you get what they call the unstandardized estimates, and to be honest they're usually not as interpretable; we're more used to working with standardized factor loadings and the correlations between factors. But you can see here that the first item was constrained to one in unstandardized terms, you can see what the other loadings are like, and likewise you can see the covariances, their standard errors, whether they're significant, and so on.

That's all fine, but probably we really want a few other things, and if you look at the help for the summary method you'll see there's a bunch of other arguments that can be supplied to it, a number of ways you can get fit measures and standardized estimates and so on. For example, you can get standardized estimates by asking for them from the summary method: take the template of the summary call and add standardized = TRUE, or fit.measures = TRUE, and so on. If we do that, we get the standardized estimates, and the Std.all column is presumably what we would normally think of as the standardized loadings and correlations. You can see, for instance, that agreeableness is correlated with conscientiousness at .26, and agreeableness with extraversion at .56 or so; extraversion and agreeableness are moderately correlated. You could also check the factor loadings here: are they above 0.3, for example, which might be one definition of a reasonable item; and sure enough they all are.

The other way of getting the parameter estimates is the parameterEstimates function, which can be a bit nicer; it returns just the parameter estimates, and you can use standardizedSolution to get the standardized loadings and correlations. A nice feature of this is that once you start fitting very big models, like a hundred-item test, you can have a lot of parameters and you might want to look at just a subset, and if you know R's way of filtering and subsetting, that can be quite nice. Here I'd save the standardized estimates into an object, and you can look at the columns: what's on the left-hand side, the operator, and what's on the right-hand side. So if you want just the loadings, you can filter on the operator column: if I say op == "=~", this collapses the output to just the loadings.
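A sketch of that filtering (the object name std is mine):

    std <- standardizedSolution(fit)  # data frame: lhs, op, rhs, est.std, se, ...

    # Keep only the factor loadings (rows where the operator is "=~")
    loadings <- std[std$op == "=~", ]
    loadings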
And that's kind of nice, because I can then ask questions about the loadings at a glance. For example, I could take that est.std column, take the absolute values, and get the mean, and in a paper I could report that the mean absolute loading was .61, which is pretty good. Or, if you define below 0.3 as bad, you could say there are zero loadings below 0.3. You could do that visually on a data set this small, but if you had a hundred items this could be handy: you can extract and summarize features of loadings or correlations or all sorts of things. So you get some nice features here, where in Amos you might not bother going through that process because it'd be awful. Likewise, you could get just the correlations, and that's all just different filters: == means equality, so you filter to where the operator is the double tilde, ~~; some of those rows are the residual variances and some are the factor correlations. So you can extract the parameters, play around with them to get what you want, and then arrange them into tables.

Probably the big things you're going to report are your RMSEA and your SRMR and all the different fit measures you might want. If you run fitMeasures on the fit object, you get, I think, all the standard fit measures: chi-square, degrees of freedom, CFI, TLI, RMSEA, SRMR, confidence intervals on the RMSEA, and so on. And one nice thing, which I really like: often you decide, by convention, having seen the conventions of reporting, that there are five or six fit measures you want, and normally you'd be scouring through the output for them, but here you can just have them listed, and there you go, you've got just the ones you want at a glance, and that could go into your paper. Just a bit more convenient. In this case we've got a CFI that's moderately good, above .9 but not .95; an RMSEA that's okay but not great, at about 0.08; and an SRMR at about 0.05, which people often regard as acceptable. That's all pretty pragmatic, and I'm assuming some experience with those rules of thumb; often it's more about model comparison, fitting a few different models and asking whether one is better than another.

If you want to get more out of the model, there are lots of things you might want to know. For example, you might want the variance explained: if you were running a structural equation model, how much variance in something is explained. Or you might want the correlations between latent factors, or the model-implied correlation or covariance matrix. All that sort of stuff is listed and available through the inspect method: essentially you say inspect, you pass the object, and then you say what you want using a special keyword in quotes. For example, to get the R-squared, which in a CFA context is the variance explained in each item by the latent factors, related to the loadings, essentially the square of the standardized loadings in this case, you run inspect with "rsquare". There are lots of things you can get, and you can go through the help file,
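A sketch of pulling out just a chosen set of fit measures, plus the R-squared per item (the measure names are lavaan's standard ones):

    # All fit measures
    fitMeasures(fit)

    # Just the ones you plan to report
    fitMeasures(fit, c("chisq", "df", "cfi", "tli", "rmsea", "srmr"))

    # Variance explained in each item by the latent factors
    inspect(fit, "rsquare")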
and down in the details section you can see all the things you can possibly get. Some of them might seem a bit obscure, but if you look around you'll generally be able to find the particular thing you want. Here are just a couple of common ones. The sample statistics might be useful, such as the covariance matrix of the data as well as the item means: how much people agreed, on average, with whatever the item was. The item "know how to comfort others", say, had a mean of about 4.5 on a 1-to-6 scale. Sometimes you want those sample statistics, and of course there are other ways of getting them too. You can also get the loading and correlation matrices: the unstandardized loadings and covariance matrix in a more traditional matrix form, and the standardized loadings and factor correlations as well. If you're familiar with factor loadings from traditional factor analysis, that standardized matrix form is a bit easier to read.

Another nice thing you can do, and I think it becomes useful when you're fitting multiple models, fit one, fit two, fit three, is to store them in a list; you can then loop over those models and extract fit statistics. To give you a quick example: I've created an object called fits, which is an empty list, and I assign that earlier model as the first model. Another comparison model might be to say that the factors are uncorrelated; that can be done with the argument orthogonal = TRUE, which gives you uncorrelated factors. You can even create new models by adding additional syntax to existing models. Say we have the earlier model but we want to allow a couple of items to correlate; we felt that a1 and a2 were very similar in wording, so we want to allow their residuals to correlate. We could essentially add that to the bottom of the model string using the paste function: it takes the existing model description and adds a new line with this little extra bit. We could then fit that model and ask whether adding that correlated residual improves the model.

Where it gets really nifty is that R has the sapply and lapply functions, which are designed to loop over lists. Essentially this says: for each element in fits, and we've got three models, give me the fit measures; and I've wrapped it in round to give three decimal places. That gives us three models and their fit statistics, and you can see why I want to round to three decimal places, because scientific notation is hard to read. It's also still probably a few too many fit measures to be convenient, so let me restrict them. And now we've got something you could actually use as a proper set of models, something you might even put in a thesis or publish. We've got our three models, and we can see that m0 was the uncorrelated-factors model, and allowing correlated factors improved the CFI dramatically, by about .11; the RMSEA got a lot better, and so did the SRMR. So definitely allowing the factors to correlate was important. And I can't remember if I chose to correlate a1 and a2 for a particular reason, but it seems to improve things ever so slightly; the CFI has gone up by about .01. I assume I did it for a reason; usually you'd have some justification, such as the items having particularly similar wording,
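A sketch of that multi-model workflow (the model and object names follow the walkthrough above; the exact selection of fit measures is a matter of taste):

    fits <- list()

    # m1: the original three-correlated-factors model
    fits$m1 <- cfa(model, data = ccases)

    # m0: same measurement model, but uncorrelated (orthogonal) factors
    fits$m0 <- cfa(model, data = ccases, orthogonal = TRUE)

    # m2: m1 plus a correlated residual between two similarly worded items
    model2 <- paste(model, "a1 ~~ a2", sep = "\n")
    fits$m2 <- cfa(model2, data = ccases)

    # One column of chosen fit measures per model, rounded for readability
    round(sapply(fits, fitMeasures,
                 fit.measures = c("chisq", "df", "cfi", "rmsea", "srmr")), 3)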
or something like that. Sometimes people want a statistical test rather than just qualitatively comparing the CFIs. There are rules of thumb about what counts as a meaningful improvement in CFI and so on, but you can also do a significance test to compare two models. Here the test of adding correlated factors gives a p-value of about 2e-16, so point-zero-zero-zero with lots of zeros before you eventually hit a one; the correlated-factors model is very much a significant improvement. That's often useful.

You may also be thinking about how to improve models as you go along: "I haven't got a great fit; is there a way to improve things?" Two ways of doing that are to look at the residuals, to see which aspects of the data the model has not captured, and to look at modification indices. You can get the residual covariance matrix, but the residual correlation matrix is probably a more meaningful metric. It's essentially saying: to what extent is the correlation between two items not captured by the model? I think this is saying that the residual correlation between c3 and a2 is notable; it's 0.14 larger than implied by the model, compared to the others, which are more like .01 to .04, so that's a fairly large deviation. We could have a look at that right now: what is it about c3 and a2? I haven't looked at this before, so I'm curious whether it makes any sense: "do things according to a plan" and "inquire about others' well-being". I don't know, nothing's jumping out at me, but maybe there's some logic to it. So that could potentially be a correlated residual to add to the model.

The other option is to look at modification indices. These essentially tell you: if you add this parameter, it will improve your model by this amount. The function is called modificationIndices, you just give it the fit, and you get a huge list of all the possible modifications you could make. For example, you could add cross-loadings: load item c1 on agreeableness, or c2 on agreeableness, or a3 on conscientiousness; all the possible cross-loadings are there. There are also all the possible correlated residuals, so you could correlate a3 with a1, or a2 with c3, which, as we saw, is moderately large in this case. Usually there are theory constraints on what would be appropriate; in particular, correlated residuals within a factor are often said to be more legitimate. It's reasonable to have two agreeableness items that are very similar, as opposed to the normal level of similarity. As a general point of practice, people usually only add modifications above a threshold, so you want to look for the particularly large ones. By default the output seems to be ordered by the variable name and maybe the operator, so it can be nice to order it by which modification indices are the biggest. This here is essentially a way of ordering by the modification indices in decreasing order, so now we have the biggest modifications at the top. So there are potential ones we might add; a couple of pairs are candidates, for example e1 and e2. Let me give that one a closer look and see if it actually makes sense. Usually with these big ones you'll be able to look at them and say, ah, adding that makes a lot of sense.
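A sketch of those comparison and diagnostic steps (object names follow the list built above):

    # Chi-square difference test between nested models
    anova(fits$m0, fits$m1)

    # Residual correlations: what the model fails to capture
    resid(fits$m1, type = "cor")

    # Modification indices, sorted so the largest improvements come first
    mi <- modificationIndices(fits$m1)
    head(mi[order(mi$mi, decreasing = TRUE), ], 10)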
Here e1 and e2 are "don't talk a lot" and "find it difficult to approach others", both negatively worded, while the third item, "know how to captivate people", is not. So in this case it's maybe picking up on some method effect related to reversed items: the two negative extraversion items are slightly more similar to each other than to the third. You can also wrap the result in head, which will just give you the first, say, ten modification indices, or you could extract just the ones above about ten; there are all sorts of ways of doing that. You could then add anything you felt was theoretically appropriate to your model as a final additional model. For example, we could use the procedure from before to make a model three and a fit three, add that correlated residual, making sure the effect is in the right direction, and rerun that fit-measures line to get it all nice and pretty. Sure enough, fit three has improved: the chi-square change, 172 minus 132, should roughly equal the modification index, at least if you only added one additional term; I think that's what it should equal. And that has led to quite a substantial improvement in CFI and RMSEA.

A final thing you might want to do is actually save the latent variables. If you wanted to take this latent agreeableness variable, add it to your data set, and do some other analyses, you can use the predict method. That essentially gives you a data frame with all those latent variables: each person's score on agreeableness, conscientiousness, and extraversion. You could then add those columns to your data set with cbind: you've got ccases as your original data set, you bind on the new saved scores, and you create a new data set with them added in, and then you can do your normal analyses. For example, you could get the correlations between them, or a plot; it's not a beautiful plot, it's quite crass, but the point is you can use those variables in normal analyses.

I'll just quickly show you SEM; I won't spend much time on it. If you wanted to actually predict something, let's try to predict these latent variables, agreeableness, conscientiousness, and extraversion, from age and gender. I've made gender a numeric variable where one is male and zero is female. What we have here is the usual stuff, three latent variables, and what's new is this single tilde: we're going to predict those three latent variables from age and gender. Normally, if you bothered with this, you'd probably have one latent variable predicting another, and you can do all of that, but this will be a simple example. Instead of using cfa we use sem, we give it the model string and the data, and we can look at the unstandardized parameter estimates, here showing just the regression ones. I'm not entirely sure of the metric of these variables, so the unstandardized estimates are a little fuzzy to me, but they're certainly showing significant effects across the board. We could use the standardized estimates instead, and you can see that a one standard deviation increase in age, whatever that is in this data set, leads to a slight increase in conscientiousness, about 0.1 of a standard deviation. For gender you probably want to get the metric right:
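A sketch of the factor scores and the small structural model (the age and gender variable names are assumptions based on the description; gender is assumed coded 1 = male, 0 = female as stated above):

    # Factor scores for each person, added back to the data
    scores <- lavPredict(fits$m1)
    ccases2 <- cbind(ccases, scores)
    cor(ccases2[, c("agreeableness", "conscientiousness", "extraversion")])

    # Structural model: predict the latent factors from age and gender
    sem_model <- paste(model,
                       "agreeableness + conscientiousness + extraversion ~ age + gender",
                       sep = "\n")

    sem_fit <- sem(sem_model, data = ccases)
    standardizedSolution(sem_fit)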
a one-unit increase on gender is a bit like moving from female to male, so you can read off the direction of the age and gender effects on conscientiousness, agreeableness, and extraversion directly. Okay, I'm almost finished.

A final thing I'll show you, and I haven't played around with this much, is that if you're used to the nice graphical diagrams of Amos, there is a way of getting something like that with a package called semPlot. Let me make this as big as we can. So there is a way of getting a diagram that shows age predicting the three latent variables, and the indicators, and if you zoom in really carefully you can actually see the numbers, and I guess the line styles have something to do with whether loadings are negative or positive, and whether they're significant or above some threshold. As I say, I haven't played around with it too much, but it's quite nice.
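A sketch of that diagram (semPaths has many options; these arguments are just one reasonable starting point):

    # install.packages("semPlot")  # if not already installed
    library(semPlot)

    # Path diagram of the fitted SEM, with standardized estimates on the paths
    semPaths(sem_fit, what = "std", layout = "tree", edge.label.cex = 0.7)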
Info
Channel: Jeromy Anglim
Views: 12,978
Rating: 4.9359999 out of 5
Keywords: statistics, rstats, melbourne, lavaan, sem, r (programming language), cfa
Id: kCXN7CRYKVo
Length: 45min 31sec (2731 seconds)
Published: Tue Jul 04 2017