R - Confirmatory Factor Analysis Examples

Captions
Alright, so in this video what we're going to do is cover the examples from the CFA basics chapter. It starts in the book around page 39 and works its way through from there, so you'll want to look at that to be able to follow along. I also have the R code uploaded, and I'm going to walk through it alongside this set of PowerPoint slides.

The first thing we're going to try is something very simple: a one-factor CFA. It's an IQ example, so we have g here for generalized intelligence, and then the subscales of the WISC. We're going to look at how to estimate a simple CFA and then make it more complicated. All of these examples use the same data; we're just going to try multiple models on the same data set.

The first thing we want to do is import the data. We did this last time with the lav_matrix_lower2full() function, and now we're also going to talk about how to go from a correlation table to a covariance table. If you have the information, always go to covariance, so you can use the unstandardized solution if you desire; if you have the entire raw data set, don't worry about any of this. We'll use a new little function that converts correlation tables to covariance tables, but it needs both the correlation matrix and the standard deviations.

Let's open that R file, and I'll make it a little bigger so you can see it. The first thing I do is load the lavaan library and the semPlot library; always do those at the beginning so you can make sure they're loaded when you run your examples. So let's import the data. I've entered only the lower half of the correlation table, formatting the triangle so each column lines up appropriately. When I run those lines, it completes the correlation table for me. The next thing I want to do is enter the standard deviations.
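A sketch of those import steps in R. The correlation values below are placeholders, not the book's data (use the actual entries from the table around page 43), and the object name wisc.cor is my own:

```r
# Load the two libraries first, every time
library(lavaan)
library(semPlot)

# Enter only the lower triangle (placeholder values, NOT the book's data);
# lav_matrix_lower2full() mirrors it into the full symmetric matrix
wisc.cor <- lav_matrix_lower2full(c(
  1.00,
  0.50, 1.00,
  0.45, 0.55, 1.00,
  0.40, 0.45, 0.35, 1.00,
  0.35, 0.40, 0.30, 0.45, 1.00))
```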
Those standard deviations need to be in the same order as the table. I think the table is on page 43; it has the correlation table, and then at the very bottom a row of standard deviations. Those should be in the same order, and that's where I got them. If you're not familiar with it, c() is the vector function: I'm entering the standard deviations as a single row of data.

The next thing we need to do is name everything. We're still using colnames() and rownames() to name the matrix; the new function here is names(). names() is for a vector, where I'm giving each individual element a name rather than each row or column a name, and you have to name both dimensions of the matrix, so I've set colnames() equal to rownames(). I suppose we could have set everything equal to a single names object, since they're all in the same order, but I tried to keep it consistent with the way we've been naming things. Notice that I've used dots in the names: really do not use spaces. This is also true of whatever you name your data objects; spaces will make your life difficult. About the only place you can use them is in labels, and we're not going to use those much this semester, so ditch the spaces. Dots work really well, underscores work great, or just use short letter combinations, but no spaces. If I type the standard deviation object down in the console, you'll see that each standard deviation now has a name, and that's how it needs to be laid out for the next part. So it's just an application of colnames(), rownames(), and names() to matrices and vectors. Now we're going to convert from a correlation matrix to a covariance matrix, and the function is cor2cov().
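Putting the naming and the conversion together as a sketch; the object names and the SD values here are placeholders of my own, not the book's:

```r
# One set of names, reused for the rows, the columns, and the SD vector;
# dots instead of spaces
wisc.names <- c("information", "similarities", "word.reasoning",
                "matrix.reasoning", "picture.concepts")
colnames(wisc.cor) <- rownames(wisc.cor) <- wisc.names

wisc.sd <- c(3.0, 3.1, 2.9, 3.0, 2.8)  # placeholder SDs, same order as the table
names(wisc.sd) <- wisc.names

# Correlation matrix + standard deviations -> covariance matrix
wisc.cov <- cor2cov(wisc.cor, wisc.sd)
```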
In cor2cov(), correlation to covariance, the first argument is the name of your correlation matrix and the second is the name of your standard deviation vector; that gives us the covariance object, wisc.cov, turning the correlation table into a covariance table using the standard deviations. Let's look at the correlation table first: it has ones on the diagonal, and everything else is less than one, because it's a correlation table. The covariance table is different: the diagonal entries are the variances, and the off-diagonal entries are the covariances. It has simply been de-standardized.

Alright, we've got our data imported and named. The next thing you always want to do is define the model, and here's some new stuff. Before, what we did was y ~ x, y predicted by x, and that worked because the variables were all manifest: they're measured, they're in the data set, so lavaan knows what to do with them. But a latent variable isn't a column in the data set or a row in the covariance table, so now I have to tell lavaan to make me a variable. The =~ operator says: create this variable, and define it by these items. So we're telling it to create a latent variable, and we're using a reflective format, where the latent variable predicts the items; formative items are made a different way, and we'll do those a little later.

Let's do that with this picture. It's pretty easy: we're going to say g =~ information plus similarities plus word reasoning plus matrix reasoning plus picture concepts. It's actually a fairly easy model to build, and let me show you here; I've hit enter in the middle of the line just so you can see all of it at once on the slide, but do not do that on yours.
So =~ creates the latent variable g, which predicts these items: information plus similarities plus word reasoning plus matrix reasoning plus picture concepts, and all of that needs to be on one line. Another thing: the book keeps using labels like a*, b*, and so on; at some point he drops them, I can't remember when, but all they do is give the parameters labels. They're not necessary, so we can leave them out. Just as a warning, when we get into some of the complicated models for the class assignments you'll see that I won't use them; it's not required, I'm just showing you what the book has. So let me build that model: it's just one line, g equals all of these items added together.

Now let's try running that model with the cfa() function. cfa() and sem() are basically the same; cfa() has a few more CFA-specific options, but honestly you could use sem() here as well, and nothing we're doing would preclude either function. I tend to use cfa() when I'm doing a CFA, so I can remember that was the goal of the analysis.

There's one more option, and this one is specific to running the model, not to the summary of the model: std.lv, standardized on the latent variable. We're going to leave it off for the majority of what we do, except for the examples where I show you how it works. The reason, and this will be much clearer when you look at the output, is that setting standardized = TRUE in the summary() function, that is, in the output rather than in the model run, shows you both the solution standardized on the latent and the unstandardized solution, so you get to see all of them. There are two places to do standardization: if you standardize when running the model, at the cfa() level, you can't see the unstandardized solution. So I much prefer running the model unstandardized and then asking for both solutions in the output, because then I see more information rather than less.

So we're going to use std.lv = FALSE; note the lowercase letters in std.lv. FALSE is actually the default, so you'll often see that the book just leaves it out, since it isn't necessary to include. Leaving it FALSE sets the scaling variable, or marker variable, to whichever item is listed first. If I go back and look at my model, I listed information first, so that's the one the latent gets scaled on; the order doesn't otherwise matter, and if you wanted to switch it you could just reorder the items in the model code. If you set std.lv = TRUE, it standardizes the latent instead: it constrains the latent variance to one and estimates all the loadings, so you get p-values for every loading. I guess that's one of the good things about it, but I'm going to show you how you can interpret the unstandardized and standardized output to decide whether that's really necessary.

So let's run the model and get the output. Here I've written model = with the model name and sample.cov = with the covariance table. I don't know that we've used the explicit model = argument name before, but it runs the same if you leave that part out; I could delete it and it would run fine. We also still have to tell it the number of observations with sample.nobs, because this isn't complete raw data, and I entered std.lv = FALSE. You can also just type F for FALSE, but I'll mostly try to spell it out; you don't have to type the whole word if you're bad at spelling FALSE like I am, because I write "flase" a lot. Alright, I have the model saved; let's get the output.
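A sketch of the model definition and the cfa() call. The object names (wisc.model1, wisc.cov, fit1) are my own, and the sample size is a placeholder; use the N the book reports for this data:

```r
# One-factor model: =~ creates the latent g from its five indicators, one line
wisc.model1 <- 'g =~ information + similarities + word.reasoning + matrix.reasoning + picture.concepts'

fit1 <- cfa(model       = wisc.model1,
            sample.cov  = wisc.cov,  # covariance input instead of raw data...
            sample.nobs = 550,       # ...so the sample size must be given (placeholder N)
            std.lv      = FALSE)     # the default: scale on the first indicator
```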
We're going to ask for the same three things we did before, standardized = TRUE to look at the standardized solutions, plus the new one we're adding, rsquare, and fit.measures, and we're going to get three types of standardized output. Let's pull it up; I'll zoom so you can see it all together.

So what happens here? Under latent variables, here's g predicting the five manifest variables. You can tell it used the first item as the marker, and since it did that, there's nothing in the standard error and p-value columns for it; it does not estimate those. The Estimate column is in the scale of the data, so this is the unstandardized solution. If the items all have the same scale, I can interpret these directly. If they didn't, I could come over to the Std.lv column, which is standardized on the latent variable; not "standardized lavaan", standardized latent variable. If I scroll down to the variances, you'll see the latent variance is one there, which accomplishes the same thing as fixing the marker loading to one, and by doing that it does estimate the marker loading. Now that it's estimated, I can see whether it's a good indicator: this one here is about 2.5, and the others are bigger, so I don't know that I need to look up the z-score for this particular one. I could, by switching the order of the items in the model, but it's within the same range of sizes as the other loadings, so I probably don't need to. Because Std.lv standardizes only the latent, these loadings are like betas: they're not fully z-scored on both sides, so they do get bigger than one.

Std.all, the last column, standardizes both the latent and the items. I really like this one because it's the most like EFA: I can see which item is the strongest, and I can use the loadings the way we'd normally judge them in an EFA. If a Std.all loading drops below 0.3, that's a good sign an item is not doing what it should be doing. It might still be significant, because standard errors tend to be very small in these types of models, but it might not be at the level I would like. 0.3 is a medium effect size for correlations between items and their latents, and that's essentially what this column is: the correlation between the item and its latent. So an item can be significant but not practically important. All of these look pretty nice and strong, with information being the strongest one.

So, to recap the three columns: Estimate is the truly unstandardized solution; Std.lv is standardized on the latent only, so the manifest variables stay in the scale of the data; Std.all standardizes everything. You also get all three of those for your variances, and there are no correlations or covariances in this model to look at.

We had some extra functions in the last lecture: we looked at fitMeasures() to see all the fit indices, not just the ones the output shows you automatically, and there are similarly some more functions here. Let's start with parameterEstimates(). It gives you the estimated values with their z-scores and confidence intervals, and, if you ask for it, the standardized solutions as well; I'm on line 46 of the script here, looking at the parameter estimates with more information about them. This isn't the easiest thing to read blown up this large; if I scroll it out, that's a little better. What happens is it tells me, for parameter one, g's relationship with information. I gave it a label by using a*, and this one isn't estimated, so its standard error column has NAs, because that is the item the latent is scaled on; not super useful. For the second one it gives me the estimate, standard error, z-score, p-value, and the lower and upper bounds of the confidence interval in the unstandardized solution. If I had standardized the model when running it, I would get the confidence interval for the standardized solution instead. Then it shows me the standardized solutions, because I asked for them, and you'll notice the last two columns are the same here. It does the same for every parameter, including the variances.

Until you get used to lavaan's notation, this looks a little weird: information ~~ information. It's a little strange, but that double tilde means information covaried with itself, which is its variance; you'll see this notation again with the modification indices. And you can actually get a confidence interval for the variance, which seems a little odd since I told it what the variance was, but remember this is an iterative process, so it's an estimate of the variance, as in what you'd expect if you did this over again.

Another new function is fitted(). The fitted() function gives me the reproduced, or recreated, covariance table. So here's the real covariance table I gave it, wisc.cov, and the fitted covariance table is what the model implies. You'll see the values are close: 9.06 versus 9.04, 6.55 versus 6.56. The top one is the real table we entered; the bottom one is what the model came up with. If the idea of estimation hasn't really clicked yet: they're not the same. They're close, because this is probably a good model, but not quite the same. When I've talked about fit indices taking the real covariance table and subtracting the reproduced one, with chi-square and every fit index estimated from that difference, this is the subtraction I mean.
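The output calls just described, as a sketch, assuming a fitted lavaan model object named fit1 (the name is my own):

```r
# Three solution columns (Estimate, Std.lv, Std.all) plus R-squared and fit indices
summary(fit1, standardized = TRUE, rsquare = TRUE, fit.measures = TRUE)

# Estimates with SEs, z values, p values, and confidence intervals
parameterEstimates(fit1, standardized = TRUE)

# Model-implied (reproduced) covariance matrix, to compare against the input table
fitted(fit1)
```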
It literally takes them and subtracts them, but if you don't want to do that by hand, you can run residuals(): the residuals table is the subtraction between the two. It subtracts the difference between those two tables and shows you which relationships maybe aren't being estimated in the best way. In another lecture we'll talk about how to tell which of these are significantly bad, but for right now I can just look at which one is the largest. The relationship between matrix reasoning and picture concepts is the largest discrepancy, and what that means is the model is having a hard time reproducing the covariance between those two, given that they're on the same factor. That's probably because, if I look at the estimates for those items, they're on the lower end; larger estimates usually mean we're better at getting those relationships correct (I'm looking at the standardized solution, lines four and five here). We'll come back to this residuals table and talk about how to tell whether there are areas in the model that are significantly bad, but for right now you can just kind of look at it. We talked last time about fitMeasures(), so I can look at all the different fit indices, even the ones I looked at before.

Something new is modification indices. We'll talk about these a lot, and here I really want to look at one of them, so let me pull them up; I didn't cover this much in the lecture component because I don't know that they make a whole lot of sense until you're actually looking at them. So we're going to run this, on line 53, and make it bigger. What it does is give you a list of changes to the model: if you were to add this path, literally add this line of code to the model definition, here's how the model would change. Actually, on this output it looks to me like it's giving you non-significant changes too; in a different section we'll talk about how to filter out the ones that don't matter, because it's going to give you every change possible. The columns lhs, op, and rhs stand for left-hand side, operator, and right-hand side, and essentially mean: what code would I add to my model definition to put in this path? So this first row is a correlation between information and similarities; the double tilde (~~) means variance or covariance, and in this case it's a correlation between the error terms for those two manifest variables. Adding it would decrease our chi-square by 0.01.

The mi column stands for modification index, which is the expected decrease in chi-square if I added this path; remember, we want chi-square to be smaller. It's always the addition of paths, not moving an item, say, from factor one to factor two, so sometimes, if you're trying to stick with theory, modification indices don't really give you what you want. Then epc is the expected change in the loading or coefficient, with sepc.lv and sepc.all as the standardized versions. Generally we're looking at the mi column for the largest value; there are ways to sort it, and we will use these this semester.

Alright, since there's only a small number of them, I'm just going to flip through. Information with picture concepts has a significant modification index, because 3.84 with one degree of freedom is a significant change in chi-square; if we did our chi-square difference statistic, it would be a significant change. This one is 8.93, and here's 14.15. Later we'll talk about how to sort these, but the biggest one, unsurprisingly, is the one with the largest residual; very large residuals will lead it to suggest you do something about them, so the two are tied together.
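As a sketch, again assuming the fitted object is named fit1:

```r
# Observed minus model-implied covariances; large entries flag poorly fit pairs
residuals(fit1)

# Modification indices: mi = expected drop in chi-square per added path,
# epc / sepc.lv / sepc.all = expected parameter change and standardized versions
modificationIndices(fit1, sort. = TRUE)
```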
It's saying we should include a correlated error between matrix reasoning and picture concepts, and that that would decrease our chi-square by fourteen points. That's a significant change, but I don't know if it's a useful change. With modification indices you have to be careful that you're doing things that are theoretically meaningful and make sense. This one might make sense given the test: I think matrix reasoning includes pictures, so maybe. But I don't know that I really need to add it, because of the fit; I haven't looked at the fit indices yet, I just kind of said here they are, but I bet the fit is not too bad in this model. Generally you look at modification indices when the fit is poor or items aren't loading correctly.

So let's go back; the fit measures are right here in the full output. Our chi-square statistic is significant, big deal; but our CFI and TLI are already very high, our RMSEA is 0.09, so it's okay, acceptable, and our SRMR is really good as well. So I don't know that this model really needs any modifications, especially because it's a simple model and that modification isn't going to help me so much that I need it. The other thing we want to look at that we haven't talked about is the R-squared values: we're predicting several of these subtests very well, and picture concepts is the one we're predicting least well. That's also tied to the estimates up above: items with bigger loadings are predicted better.

Okay, let's draw the picture. We're going to use that same semPaths() function, and the tree layout works really great for CFAs. I'm going to move the window over, because otherwise it won't have enough space down in this corner. Here's my picture. As far as I can tell, it isn't possible to make the loading numbers any larger, so you will have to squint at yours, but you'll see the variances out here as numbers, the 6.65; I've got my dashed line here for the unestimated marker parameter, the error variances for each item, and the loadings for each one. If I want to see the standardized solution, which I think is a little more common in pictures, the g up here turns into a 1, because that standardizes on the latent variable, and then I get estimates for each loading and the standardized errors down below. You'll notice those do not match the unstandardized numbers out here, because those were estimates of variances in the original scale and these are standardized.

From there, what we're going to do is take that exact example and estimate it by standardizing on the latent, and talk about how the output is a little different; but first let me go through some notes here. We interpreted our parameter estimates; we saw that picture concepts is probably not being estimated super well, but it's still within the range of what I would say is a pretty good effect size. We looked at the fit indices, and they look pretty good; RMSEA seems a little high. We talked about residuals and modification indices and R-squared (I really should add R-squared to this slide too).

Some things we haven't mentioned yet: Heywood cases. Checking for Heywood cases means making sure the solution is logical, and really it's where we should start. What we want is to make sure the variances are positive and the squared multiple correlations, the R-squareds, are less than one. Our R-squareds here are all less than one, great; the variances down here are all positive, great; so we do not have a Heywood case. It does warn you when you get a Heywood case, but sometimes, because the output is so long, you'll miss it, so always come back up to where the model ran to see if you got a warning message.
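The two diagrams might be drawn like this; a sketch assuming the fitted object fit1, using semPlot's semPaths() with its tree layout:

```r
library(semPlot)

# Unstandardized diagram: the dashed edge marks the fixed marker loading
semPaths(fit1, whatLabels = "par", layout = "tree")

# Standardized diagram: the latent variance becomes 1, loadings standardized
semPaths(fit1, whatLabels = "std", layout = "tree")
```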
It will still give you a solution even if you have a Heywood case, and if you just leave it, that's an inappropriate solution: it will still mathematically give you numbers even though it shouldn't.

Are there any crazy standard errors? That's another thing to look for. By that I mean: are they all approximately the same for each item? Manifest variables tend to have very small standard errors, because we normally have very large sample sizes; variances usually run a little bigger, so it kind of depends on the scale of the data. These look fine: the largest one is for g, but it's well within the range of the others. When some standard errors start to be ten times the size of the rest, say some at 0.05 and some at 10, that's not good.

We did look at the estimates. With the parameters, we want them to make sense, so check the standardized parameters; those are comparable to EFA loadings if you look at Std.all. The z-score is calculated by taking the parameter and dividing by its standard error; that ratio is often called the critical ratio, and you'll see some other programs label it CR. It's a z-score, the ratio of the parameter to its standard error, and because standard errors are so small, you usually get very, very large z values. That's why I say significance isn't always the most useful number; instead, look at Std.all to make sure the value is appropriate for the type of analysis you're doing, because often any parameter estimated at 0.1 is significant, even if that's not useful.

Standard errors can be tricky, because they're based on the scale of the data. You don't want them to be zero, because that means there's no variance in the item; an item with no variance is easy to predict, but it's not a very good item, so you always want some variance. You also don't want them to be large, because that means you're estimating poorly. So: not effectively zero, and not super large.

A quick reminder on model fit: we want a non-significant chi-square; for RMSEA and SRMR we want small numbers; for CFI and TLI we want large numbers. We met the good criteria here: CFI and TLI are actually excellent at 0.95 or better, RMSEA was in the okay range, and SRMR was in the excellent range as well.

With modification indices, you really want to focus on what they mean: are they theoretically useful? In some of the example assignments I'll just tell you to add some and see what happens, because you're learning how to do this; but if it's your own model, definitely make sure they make sense and it's a story you can sell. Usually a CFA is meant to test a specific picture, so a big modification index doesn't necessarily mean you should add the path. But sometimes correlated errors make sense, and sometimes two items are so correlated that it's a good indication one of them needs to stay and the other needs to go, because those items are effectively measuring the same thing. So I guess modification indices are sort of a low-cost pass to make sure nothing in the model is acting funky, especially correlated error terms: if I have two items that are just too correlated, I might consider removing one of them.

An overfitted model can happen when you add parameters that help model fit: you're adding parameters to make the fit indices better, but it doesn't really help theoretically, and it probably won't replicate, because it's just a quirk of your data. So you have to be careful when you chase modification indices; it's a lot like fishing, and it can feel like cheating.

Alright: now we're going to build three more models from this data set, and then we'll use the anova() function to compare them, because they are nested.
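Pulling out just the indices used to judge and compare models can be sketched like this, again assuming fit1 is the fitted model; the measure names are lavaan's own:

```r
# Only the indices discussed: chi-square, RMSEA, SRMR, CFI, TLI, AIC, ECVI
fitMeasures(fit1, c("chisq", "df", "pvalue", "rmsea", "srmr",
                    "cfi", "tli", "aic", "ecvi"))

# Once a second nested model (say, fit2) has been fit, compare them with:
# anova(fit1, fit2)
```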
Remember, for nested models we can look at differences in CFI: a change greater than 0.01 matters, and lower is worse. We can also look at AIC or ECVI, where lower is better. So what I'm going to do is go to the very end of the notes and type in the fit indices for this data. The RMSEA was 0.089, SRMR 0.034, and let me grab the ECVI; oops, I didn't include those in my table, so let's do that now. There's the ECVI, and our AIC; so many letters, not enough time. Oh my goodness, the AIC is huge; I don't know how to interpret that number on its own, but we will when comparing models. CFI was 0.980; it is common to use three decimals for this sort of thing, since these indices don't go over one. And then our chi-square on 5 degrees of freedom.

Alright, let's switch to model 2. For example 2 we're going to do the exact same model but with std.lv = TRUE, standardized on the latent variable (it's "latent variable", I just had the wrong thing in my head), and see what that changes in our output. Basically, the Estimate column and the Std.lv column are now the same. You will notice the standard errors are different, because now it's estimating standard errors for the solution standardized on the latent, and you get z-scores for all of the paths; so this is a way to see all of the p-values, if that's more what you're interested in. I don't have to change the model at all, since I'm not changing the structure; I'm just changing this one word to TRUE and estimating in a different way. In my output you'll see Estimate and Std.lv are the same, but now I have z-score estimates for all of them, and each is significant. Std.all will not change from the previous run, because it's still the same model. This time it estimated variances for everything except g, and my R-squareds are the same, so the fit indices and everything else for this model should be exactly the same.

But now parameterEstimates() gives me the lower and upper bounds for the standardized estimates, so I can look at the confidence interval for the standardized solution, as opposed to the confidence interval for the unstandardized solution. To me, that's the only real reason to use the version where I standardize in the model and not just in the summary: I get a different confidence interval, and p-values for all of them. That's the advantage of standardizing in the model. I personally don't ever do it that way; it has its moments, but I don't tend to use it.

All the fit statistics should stay the same, but just in case you weren't sure, we can look at the fitted covariance table, the residual table, and fitMeasures(): here's the ECVI again, RMSEA 0.089, the exact same model, copy-paste. So model 1 and model 2 are the same, and the modification indices give me the same picture too, cool. The path diagram changes just a little bit; actually not much, because I showed you the standardized one before, but you'll see it now has the little dashed line up here on the latent, whereas the previous one had the dashed line out on the first loading, because that's where each model was scaled. The dashed line shows you where the model is scaled, and that's really just to show you the difference between the two types of estimation.

Now, for examples three and four, what we're going to do is first estimate a two-factor model instead of the one-factor model, and then a fully structural model. We aren't quite there lecture-wise, but it's in this chapter, so it's kind of a preview of a chapter ahead; we're going to talk a little bit about how to run that type of model.
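Model 2 is the same definition refit with std.lv = TRUE; a sketch, with my own object names and a placeholder N:

```r
# Same one-factor model, but standardized on the latent while fitting:
# the latent variance is fixed to 1, and every loading gets an estimate,
# a standard error, and a p-value
fit2 <- cfa(wisc.model1,
            sample.cov  = wisc.cov,
            sample.nobs = 550,   # placeholder N
            std.lv      = TRUE)

summary(fit2, standardized = TRUE, rsquare = TRUE, fit.measures = TRUE)

# Confidence intervals are now for the standardized estimates
parameterEstimates(fit2)
```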
that type of model. So I'm going to specify a two-factor model. I already created the picture for you here, where I've got verbal and fluid reasoning, or intelligences, and so this is the separation, but you'll see that I have only two items on this second latent variable, and that's kind of cutting it close. So, verbal and fluid IQ, same data set. The first thing we want to do is define the model, and remember, we don't have to use those labels, but because the book did, here's an explanation. The tilde (~) means "y is approximated by x"; the equals-tilde (=~) is the definition of our latent variables; and the double tilde (~~) is a covariance or a correlation, depending on whether you're looking at the unstandardized or standardized solution. A double tilde with a single variable is a variance: if I did Information ~~ Information, that just means the variance of Information with itself; if I did Information ~~ Picture.Concepts, that would be the covariance relationship between them. Let's look at all of that. So I've defined the model here: verbal is defined by Information plus Similarities plus Word Reasoning; fluid is defined by Matrix Reasoning plus Picture Concepts; and then we labeled the covariance here, but really, unless you want to label it, you don't need to put it in there at all. If I left it in there without the label, lavaan gives me a little warning that says, roughly, "really, you don't have to do that"; I don't remember exactly what it says, but it's something like that. Here we've labeled them all, and labels are also useful if you want to fix a parameter to a specific number: I could set it to exactly zero, or to 0.2 or something, if I wanted. Now remember, the lecture notes, or at least the book, say that if you have only two items for a latent variable, the two paths should be set equal; but then the book didn't even do that, so it's not strictly necessary. Still, if you're having problems with your model, you can try it. We are going to scale on a marker variable, so Information and Matrix Reasoning are going to be the scaled indicators, and you'll see that in the output.
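The operators just described can be collected into one lavaan model string. This is a sketch following the book's example; the indicator names and the `verbalfluid` label are assumptions, and your column names may differ.

```r
# Two-factor CFA syntax illustrating the lavaan operators above
wisc4.twofactor <- '
  # =~ defines a latent variable from its indicators
  verbal =~ Information + Similarities + Word.Reasoning
  fluid  =~ Matrix.Reasoning + Picture.Concepts

  # ~~ between two different variables is a (co)variance;
  # the label before * is optional
  verbal ~~ verbalfluid*fluid

  # ~~ of a variable with itself is its variance, e.g.
  # Information ~~ Information
'
```

A numeric value in place of a label (for example `verbal ~~ 0*fluid`) fixes the parameter rather than labeling it, which is how you would constrain the covariance to a specific number.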
All right, so we've defined that model; now we're going to run it and look at a summary. You'll notice I turned standardization off in the model itself. So what do I want to look at in that summary? Fit indices, parameter estimates, R-squareds, and residuals. The first thing I always do is look for Heywood cases, so I come down here to the variances, and they're all, I'm sorry, what's the word, positive, and that's good, and my R-squareds are all below one. You'll see that, in comparison to our one-factor model, we do appear to be estimating Picture Concepts and Matrix Reasoning a little better, so the R-squareds on those have come up. I don't get a z-score estimate here because that's the marker variable, but I did turn on standardized-all in the output, so I can tell they're all very strong, and look at these, these went up a lot too, so these items are doing much better on a separate factor. I do get the covariance between the two factors, so here's the covariance, 0.2, okay, that's neat, but here's the correlation, and these two columns are the same because there's only one way to standardize that particular parameter, and that's the correlation. So the correlation between the two factors is 0.8; that's really high, so collapsing them might make sense, and because this correlation is so high we would need to see a significantly better model here to justify two factors. All right, my variances all look fine. Let's check the standard errors: those look good, they're all fairly small and roughly the same size, so we're doing okay and don't appear to have any issues with the model. The parameters look appropriate, they're all loading the way we would expect, and then, if we wanted to, we could look at all of the
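One way to request the pieces checked in this walkthrough in a single call. A sketch with assumed object names (`wisc4.twofactor`, `wisc4cov`) and an assumed sample size:

```r
library(lavaan)

# Fit the two-factor model; scaling defaults to the first (marker) indicator
fit3 <- cfa(wisc4.twofactor, sample.cov = wisc4cov, sample.nobs = 550)

# Fit measures, standardized columns (Std.lv and Std.all), and R-squareds
summary(fit3, fit.measures = TRUE, standardized = TRUE, rsquare = TRUE)

# Heywood-case screening: negative variances, or standardized loadings
# and R-squareds above 1, are the warning signs
inspect(fit3, "rsquare")
```

The `rsquare` values come out of the same fitted object, so pulling them with `inspect()` is just a convenience when you don't want the full summary.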
different parameter pieces, including the confidence intervals for them. Next, I can look at the covariance table; that's the original one, I'm sorry, this is the fitted one, and it's going to be approximately the same. You'll see that Picture Concepts with Matrix Reasoning, which was our largest residual before, is now very good, and the residuals with Word Reasoning are not so good. Let's fill in our fit table. So here's my CFI, let me copy it into the table; the RMSEA went down, and that's good, remember we want the RMSEA to be small; the SRMR also went down. We don't have a really good way to say that's significantly better until we get down here: the ECVI appears to be smaller than it was, and the AIC is also smaller, but not by a whole lot. Let's get the chi-square; here it is at four degrees of freedom (ah, the joys of recording at home), and the chi-square dropped quite a bit by adding one extra path. That one extra path is the correlation between the two latent variables, and the trade-off is that we now have one more marker variable. What we can do now is use the anova() function, and I don't have this typed in here, so you have to type it yourself: anova(), comparing the first model, wisc4.fit, to the model 3 fit. I'm not going to put in model 2, because models 1 and 2 are the exact same model, just with a change in standardization, so I really just want to compare models 1 and 3. I can tell that the chi-square change is fourteen points at one degree of freedom, and that's significant, so this is a significant chi-square change; the ECVI is lower, which supports us, and the CFI supports this as well, since it went up by at least 0.01. So model 3 is a significantly better model than models 1 and 2. All right, the last thing we want to do is make a picture. So here is the unstandardized solution, and you'll see this is not necessarily favored, because people like numbers they can interpret, and they want to see all the numbers.
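The nested-model comparison just described can be sketched as follows; the fit object names are assumptions carried over from the earlier sketches.

```r
# Chi-square difference test between the nested one-factor and
# two-factor models (model 2 is skipped: same model as model 1,
# only the standardization differs)
anova(fit1, fit3)
```

A significant chi-square change (here about 14 on 1 df) favors the less constrained two-factor model; AIC and ECVI can be compared directly, lower being better, and a CFI improvement greater than 0.01 points the same way.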
So it's probably more common to use the standardized solution, and now I can see that the correlation between the factors is rather high, and I can look at the estimates in sort of an EFA style, so I can tell whether they're loading the way I would expect them to load. This is a fairly nice-looking picture, probably still not publishable because people would want to read the squares, but the good thing is that I can at least see that I have done what I was trying to do. One more example: now we're going to work on what's called a fully latent model, and this is sort of a preview of what's next. The next thing we'll cover is second-order models, and full structural models come after that. This is copied from before, but now, since our latents are defined, we can actually pick a direction instead of a double-headed arrow by using latent ~ latent. Again it's "y is approximated by x", so y goes on the left, but the latents do have to be defined first, so the order of the lines matters. So what we're going to do (and I made the picture; it does not look as pretty, the one in the book, around page 49 or 50, is much better) is use fluid intelligence to predict verbal intelligence. So we say F predicts V, and V goes on the left: V ~ F, because y goes on the left unless you're defining variables. If you aren't sure, try both and see which picture looks right. We've defined V and F first, or this will not run, and then V is approximated by F, so it's a single tilde, V ~ F. All we've actually changed from the previous model is that instead of two tildes it's one. Now let's work through the same steps we've been doing, by running it and looking at everything. Okay, so I've run it, and I've left standardization off; let's look at our summary.
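The one-character change from CFA to fully latent model looks like this in lavaan syntax. A sketch with the same assumed names as before:

```r
# Fully latent (structural) model: the double tilde between the latents
# becomes a single tilde, turning the covariance into a regression
wisc4.fullylatent <- '
  # latents must be defined before they can be used in a regression
  verbal =~ Information + Similarities + Word.Reasoning
  fluid  =~ Matrix.Reasoning + Picture.Concepts

  # y goes on the left: fluid predicts verbal
  verbal ~ fluid
'

fit4 <- sem(wisc4.fullylatent, sample.cov = wisc4cov, sample.nobs = 550)
summary(fit4, fit.measures = TRUE, standardized = TRUE, rsquare = TRUE)
```

With a single directional path between two latents the fit is identical to the covariance version; only the interpretation, and the output section it appears under, changes.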
If I scroll up and show you the previous output, from model 3, it gives you the double tildes and says "Covariances" here. When I change that to a single tilde and come all the way to the bottom, you'll see it now says "Regressions", meaning the direction is predicted. Now, when you completely standardize it, because there's only one predictor it's still the correlation, but it does give me an estimate in the scale of the data as opposed to a covariance, and the completely standardized column and the standardized-on-the-latent column are the same number. So this isn't going to change the model much, because we're predicting a direction with only one prediction, and it stays about the same. You'll notice all of these stay the same; our R-squareds are the same, except now we've got an R-squared for V, and we're predicting that variable pretty well, because these factors are so highly correlated. Let's see what else: all my variances are positive, and all my R-squareds are less than 1, which is what we want. If we wanted to look at the confidence intervals, we could; that would give me the confidence interval of V on F here, that's line six, there's line six, and it ranges from 0.85 to 1.2, and then it gives me the correlation. My covariance table isn't going to change, and my residuals are the same as the previous model, because all I've done is add a direct path between latents. My fit statistics are also likely to stay the same, because I haven't really added or removed a path, I've just picked a direction instead of having it be double-headed, so let's just check real quick. Let me pick two of them here and make sure they're the same, and they are, so it's the same model. So really the distinction between models 3 and 4 is the direction: before, we said these two are correlated, and that's a CFA; if we say no, this one comes first and predicts the other, that's a fully latent model. And so my anova() function here is really a comparison of either of models 1 and 2 against either of models 3 and 4.
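The confidence intervals mentioned above can be pulled for every parameter at once. A sketch; the `fit4` name is assumed, and which row holds the verbal-on-fluid path depends on your model's layout:

```r
# Full parameter table with 95% confidence intervals; the structural
# path (verbal ~ fluid) appears as one row of this table
parameterEstimates(fit4, ci = TRUE, level = 0.95)
```

For standardized confidence intervals instead, `standardizedSolution(fit4)` gives the analogous table for the standardized estimates.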
With that directional arrow, I can't tell if this model is any better, right? So should it be a covariance, or should it be directional? Well, these are the exact same model statistics, so I really can't compare them, and the question just becomes a theoretical one: do you expect the latents to be correlated, or do you expect a specific directionality? Now, if you had multiple directional arrows, the statistics would change, but having just one here doesn't do a whole lot to the model; it changes the interpretation more than the fit indices. Let's see, for modification indices we'll still get the same ones as the last model. I don't think we looked at them, though, and what they really suggest is correlating the errors of Word Reasoning and Matrix Reasoning, because those have the largest error terms. But if I look at Word Reasoning and Matrix Reasoning, those are on different latent variables at the moment (I know this is the other model, but all we've done is change the direction of this one path), and if items are on separate factors you don't tend to want to correlate their errors, because that implies they probably should be on the same factor. If you're saying they're so correlated that I need to add this error covariance, well, why are they on different factors? So we tend to only add correlated errors within a factor. It also suggests cross-loadings, that is, that F should also be related to the other three variables, but none of the modification indices you see here cross a significant line, and when we get to the point where I want you to use these, we'll talk about how to sort them so you don't even see the unnecessary ones. Now let's make a picture; that's kind of a last step, and you can play with the layout. I tried tree and I picked spring, because it's the most readable; tree is not so readable, as you can tell. Remember the options include spring, circle, circle2, tree, and tree2, and of all of them spring is probably the most readable here.
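The sorting of modification indices and the layout options just mentioned can be sketched like this (object name assumed, and the 3.84 cutoff is just the 1-df chi-square critical value, an illustrative choice):

```r
library(lavaan)
library(semPlot)

# Modification indices, sorted largest first, hiding trivial ones
modificationIndices(fit4, sort. = TRUE, minimum.value = 3.84)

# Path diagram; layout can be "tree", "tree2", "circle", "circle2",
# or "spring" - spring was the most readable for this model
semPaths(fit4, whatLabels = "std", layout = "spring")
```

With `whatLabels = "std"` the diagram shows standardized estimates on the paths, which is the version the video describes as closer to publishable.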
It's certainly not publishable, but I could at least tell that the direction of the arrow went the way I wanted it to. So, to sum up: models 3 and 4 fit better than models 1 and 2, and I showed you specifically how to look at the outputs for standardization, which was kind of a big focus today, but also how to look at parameter estimates, fit indices, modification indices, and a little bit on Heywood cases. So at this point you can actually do a CFA on your own and check whether the model fit well and the parameters loaded well, and later we'll do more on how to really change and modify problems with models.
Info
Channel: Statistics of DOOM
Views: 17,181
Rating: 4.878788 out of 5
Keywords: statistics, r program, rstudio, cfa, confirmatory factor analysis, sem, lavaan, structural equation modeling
Id: eUwaNcnJLfA
Length: 56min 39sec (3399 seconds)
Published: Thu Jun 02 2016