Lindy
Calcite | Level 5

Hi folks,

I am running PROC CALIS on a model built from several latent factors, but I have a problem. When I create the latent factors with PROC FACTOR, cases with missing values on the items used for a factor cannot be used. I started with 260 cases, but after creating the latent variables from items with missing values, only 150 cases are left for the SEM because the latent variables themselves have missing values.

MY QUESTION:

Should I impute the missing values on a latent factor (after I create it) before running the SEM in PROC CALIS? If so, how do I impute missing values on a latent factor?
Or should I first impute the values of the items and then use PROC FACTOR on the imputed items, so that the resulting factor has no missing values?

Thank you!

 

Daniel_Paul
Obsidian | Level 7

Hey Lindy,

 

I am not sure exactly what you are doing. Are you using a two-step approach (first estimate factor scores, then use the factor scores as indicators in the SEM)? A one-step approach, with the items as indicators of the latent variables in the SEM, is more efficient than the two-step approach. Another advantage of the one-step approach is that missing values can be handled by the FIML estimator in PROC CALIS (assuming MAR).
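
A minimal sketch of the one-step approach, assuming a hypothetical data set `work.wave_items` with items `x1`-`x3` and `y1`-`y3` and two latent factors called `F1` and `F2` (none of these names come from the thread). FIML works from the raw data rather than a covariance matrix, so cases with some missing items still contribute:

```sas
/* One-step SEM: the items load directly on the latent factors.   */
/* Data set, item, and factor names are hypothetical placeholders. */
proc calis data=work.wave_items method=fiml;
   path
      F1 ===> x1 = 1.0,   /* first loading fixed to 1 to set the scale of F1 */
      F1 ===> x2 x3,      /* remaining loadings are free parameters          */
      F2 ===> y1 = 1.0,
      F2 ===> y2 y3,
      F1 ===> F2;         /* structural path between the two factors         */
run;
```

Because this is a single PROC CALIS run on all available cases, the usual fit summary is printed for the full sample.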

 

Bye, Daniel 

Lindy
Calcite | Level 5

Thank you, Daniel! I am using the two-step approach. I am fitting a cross-lagged model in SEM with three waves of data. I first created the factors with CFA and then entered those factors into the SEM. The model is complicated, and using the items directly as indicators in a three-wave cross-lagged model is very clunky.

I know we can use PROC MI and PROC MIANALYZE with SEM to deal with missing values, so it is not a big issue if some factors in a model have missing values. But PROC MIANALYZE will NEVER give us model fit indices...

For example, say I have a data set of 500 cases, but because of missing values only 145 cases are used in the SEM if I do not impute. The output based on those 145 cases gives me model fit indices such as RMSEA, AIC, chi-square, etc.

But when using proc MI and proc mianalyze, indeed the sample size is 500, but I have no fit indices in the output. I don't think it is proper to use the fit indices from the 145 cases output. What could be a solution?
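
For context on why the fit indices drop out: PROC MIANALYZE combines the per-imputation results with Rubin's rules, which are defined for a parameter estimate that comes with a standard error. In standard notation (not taken from this thread), with m imputations and estimate \hat{Q}_i with variance U_i from imputation i:

```latex
\bar{Q} = \frac{1}{m}\sum_{i=1}^{m}\hat{Q}_i, \qquad
\bar{U} = \frac{1}{m}\sum_{i=1}^{m}U_i, \qquad
B = \frac{1}{m-1}\sum_{i=1}^{m}\bigl(\hat{Q}_i-\bar{Q}\bigr)^2, \qquad
T = \bar{U} + \Bigl(1+\frac{1}{m}\Bigr)B
```

Here \bar{Q} is the pooled estimate and T its total variance. Fit statistics such as chi-square or RMSEA are not estimates with a standard error of this kind, which is one reason they are not part of the pooled output.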

  

Daniel_Paul
Obsidian | Level 7

Hey Lindy,

 

Well, without more details I cannot work out a concrete solution, but you should be aware of the following results:

 

  • Performing factor analysis to parcel the item set before SEM can produce highly biased parameter estimates and affects the goodness-of-fit measures in SEM (Bandalos, 2002, 2008; Kim & Hagtvet, 2003). Hence, the aforementioned authors recommend not parceling (for a contrary view, see Little, Cunningham, Shahar, & Widaman, 2002).
  • Multiple imputation has to be done on the items (and thus before the factor analysis); otherwise the factor solution could be biased (Weirich, Haag, Hecht, Böhme, Siegle, & Lüdtke, 2014).
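
The second point could be sketched as follows, assuming the same hypothetical data set `work.wave_items` with items `x1`-`x3` and `y1`-`y3` (names, number of imputations, and seed are placeholders, not from the thread). The items are imputed first, and the model is then fitted to each completed data set:

```sas
/* Step 1: impute the items themselves, not the factor scores.       */
/* All data set and variable names are hypothetical placeholders.    */
proc mi data=work.wave_items nimpute=20 seed=12345 out=work.items_mi;
   var x1-x3 y1-y3;
run;

/* Step 2: fit the same model to every completed data set.           */
/* PROC MI adds the index variable _Imputation_ to the OUT= data set. */
ods output Fit=work.fit_by_imp;   /* fit summary, one copy per imputation */
proc calis data=work.items_mi method=ml;
   by _Imputation_;
   path
      F1 ===> x1 = 1.0,
      F1 ===> x2 x3,
      F2 ===> y1 = 1.0,
      F2 ===> y2 y3,
      F1 ===> F2;
run;
```

The ODS table name `Fit` is given from memory, so it is worth confirming with `ods trace on`. Looking at how the fit indices in `work.fit_by_imp` vary across imputations is only an informal check, not a formally pooled fit statistic.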

Hope this helps a little bit.

 

Bye, Daniel

 

Bandalos, D. L. (2002). The effects of item parceling on goodness-of-fit and parameter estimate bias in structural equation modeling. Structural Equation Modeling, 9, 78-102.

Bandalos, D. L. (2008). Is parceling really necessary? A comparison of results from item parceling and categorical variable methodology. Structural Equation Modeling, 15, 211-240.

Kim, S., & Hagtvet, K. A. (2003). The impact of misspecified item parceling on representing latent variables in covariance structure modeling: A simulation study. Structural Equation Modeling, 10, 101-127.

Little, T. D., Cunningham, W. A., Shahar, G., & Widaman, K. F. (2002). To parcel or not to parcel: Exploring the question, weighing the merits. Structural Equation Modeling, 9, 151-173.

Weirich, S., Haag, N., Hecht, M., Böhme, K., Siegle, T., & Lüdtke, O. (2014). Nested multiple imputation in large-scale assessments. Large-scale Assessments in Education, 2, 1-18. https://largescaleassessmentsineducation.springeropen.com/articles/10.1186/s40536-014-0009-0

Lindy
Calcite | Level 5

Really appreciate your help, Daniel!

It is important to know that researchers should not split factor analysis and SEM into two separate steps.

--Lindy
