Hi folks,
I am running proc calis on a model with several latent factors, but I have a problem. When I create the latent factors with proc factor, cases with missing values on the items used to build a factor cannot be used. Originally there were 260 cases; after creating some latent variables from items with missing values, there are only 150 cases left for the SEM because the latent variables themselves have missing values.
MY QUESTION:
Should I impute missing values on a latent factor (after I create the latent factor) before running the SEM in proc calis? If so, how do I impute missing values of a latent factor?
Or should I first impute the values of the items used to create a latent factor, and then run proc factor on the imputed items so that the resulting factor has no missing values?
Thank you!
Hey Lindy,
I am not sure what exactly you are doing. Do you use a two-step approach (first estimate factor scores, then use the factor scores as indicators in the SEM)? More efficient than the two-step approach is a one-step approach in SEM with the items as indicators for the latent variables. Another advantage of this approach is that missing values can be handled by the FIML estimator in PROC CALIS (assuming MAR).
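A minimal sketch of what the one-step approach could look like, not a finished model. All names here are assumptions for illustration: a raw data set WORK.WAVES, items x1-x3 measuring one construct at wave 1 and y1-y3 measuring it at wave 2, and latent factors called F_w1 and F_w2.

```sas
/* Hypothetical data set and variable names; replace with your own.      */
/* METHOD=FIML needs the raw data (not a covariance matrix) and uses     */
/* every case, including cases with missing item values, assuming MAR.   */
proc calis data=work.waves method=fiml;
   path
      /* Measurement part: items are indicators of the latent factors.   */
      /* The first loading of each factor is fixed to 1 to set its scale. */
      F_w1 ---> x1 = 1.0,
      F_w1 ---> x2,
      F_w1 ---> x3,
      F_w2 ---> y1 = 1.0,
      F_w2 ---> y2,
      F_w2 ---> y3,
      /* Structural part: wave-1 factor predicts wave-2 factor.          */
      F_w1 ---> F_w2;
run;
```

Names in the PATH statement that are not variables in the data set are treated as latent factors. A cross-lagged model would add a second construct per wave and the lagged paths between the constructs in the same PATH statement.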
Bye, Daniel
Thank you, Daniel! I am using a two-step approach. I am fitting a cross-lagged model in SEM with three waves of data. I first created factors based on CFA and then entered these factors into the SEM. The model is complicated, and using the items directly as indicators in this three-wave cross-lagged model is very clunky.
I know we can use proc MI and proc mianalyze with SEM to deal with missing values, so it will not be a big issue if some factors in a model have missing values. But proc mianalyze will NEVER give us model fit indices...
For example, I have a data set of 500 cases, but due to missing values only 145 cases are used in the SEM if I do not use imputation. The output based on 145 cases gives me model fit indices such as RMSEA, AIC, chi-square, etc.
But when using proc MI and proc mianalyze, the sample size is indeed 500, but I have no fit indices in the output. I don't think it is proper to use the fit indices from the 145-case output. What could be a solution?
Hey Lindy,
well, without more details I cannot figure out a concrete solution, but you should be aware of the following results:
Hope this helps a little bit.
Bye, Daniel
Bandalos, D. L. (2002). The effects of item parceling on goodness-of-fit and parameter estimate bias in structural equation modeling. Structural Equation Modeling, 9, 78-102.
Bandalos, D. L. (2008). Is parceling really necessary? A comparison of results from item parceling and categorical variable methodology. Structural Equation Modeling, 15, 211-240.
Kim, S. & Hagtvet (2003). The impact of misspecified item parceling on representing latent variables in covariance structure modeling: a simulation study. Structural Equation Modeling, 10, 101-127.
Little, T. D., Cunningham, W. A., Shahar, G. & Widaman, K. F. (2002). To parcel or not to parcel: exploring the question, weighing the merits. Structural Equation Modeling, 9, 151-173.
Weirich, S., Haag, N., Hecht, M., Böhme, K., Siegle, T. & Lüdtke, O. (2014). Nested multiple imputation in large-scale assessments. Large-scale Assessments in Education, 2, 1-18. https://largescaleassessmentsineducation.springeropen.com/articles/10.1186/s40536-014-0009-0
Really appreciate your help, Daniel!
It is important to know that researchers should not parcel factor analysis and SEM into two steps.
--Lindy