liloit
Calcite | Level 5

 

Dear All, 

 

I need to validate a new questionnaire made of 62 binary items. I have to perform EFA, assess reliability, and run CFA.

Unfortunately, I have a lot of missing values (80% of subjects have at least one missing answer) and I don't know how to deal with them. I don't think listwise deletion is a solution here, but should I impute those values before proceeding with EFA? If so, do you have any suggestions on which method(s) I should use in SAS?

 

 

Many thanks in advance for your time and help,

Giulia

4 REPLIES
ballardw
Super User

EFA? CFA? TLAs (three-letter acronyms) are not always known to everyone and are often jargon specific to a field, so it may help to describe what these activities entail.

 

Are you sure that some, if not all, of the missing values should not be missing? Many surveys involve skip patterns, where the response to one question determines whether or not the respondent should be asked other questions. Example: male respondents should not generally be asked about female-targeted products or gender-specific health issues.

liloit
Calcite | Level 5

I'm sorry, I was referring to Exploratory Factor Analysis (EFA) and Confirmatory Factor Analysis (CFA).

Unfortunately that is not the case for this survey: all the questions should have been answered (i.e. there are no skip patterns).

ballardw
Super User

Is there a pattern to the missing values, such that one or two questions account for the majority of them, or are they scattered pretty much across all of the variables?
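A quick way to answer that is to tabulate the share of missing answers per item and the share of respondents with at least one missing answer. The sketch below is in Python purely for illustration (it is not SAS code), and the function name `missing_summary`, the item names `q1` to `q3`, and the toy data are all made up; missing answers are assumed to be coded as `None`.

```python
def missing_summary(rows):
    """rows: list of dicts mapping item name -> 0, 1, or None (missing)."""
    items = list(rows[0].keys())
    n = len(rows)
    # Proportion of respondents with a missing answer, item by item.
    per_item = {item: sum(r[item] is None for r in rows) / n for item in items}
    # Proportion of respondents with at least one missing answer.
    incomplete = sum(any(v is None for v in r.values()) for r in rows) / n
    return per_item, incomplete


# Hypothetical toy data: 4 respondents, 3 binary items.
responses = [
    {"q1": 1, "q2": None, "q3": 0},
    {"q1": 0, "q2": 1, "q3": None},
    {"q1": 1, "q2": 1, "q3": 1},
    {"q1": None, "q2": 0, "q3": 0},
]

per_item, incomplete = missing_summary(responses)
print(per_item)    # missing rate for each item
print(incomplete)  # share of respondents with any missing answer
```

If a handful of items show much higher rates than the rest, the missingness is concentrated; roughly equal rates across items suggest it is scattered.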

If a couple of variables account for the missing values, you might consider the analysis without them. There may also be a systemic reason for just a few to be missing, such as a very poorly phrased question ("Have you stopped beating your wife yet?") or asking for a yes/no answer when the question, possibly by implication, has more than two valid answers.

 

And by any chance, are you looking at recoded data that reduced a multiple response down to two categories? Possibly the original question had "yes", "no", "I don't know" and "refused to answer" as responses, and only the yes/no answers are what you see. Then you might actually consider a different coding/recoding scheme or analysis.

 

Do you have any characteristics of the respondents, such as age, race, gender, location, or activity, that could be used to form groups of similar respondents? If so, one approach may be to impute each missing value with the most frequent response to that variable within the group. Another is to randomly assign a value with probability equal to the proportion of each response among the non-missing values within the group.
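Both group-based ideas can be sketched in a few lines. This is an illustration in Python, not SAS, and every name in it (`impute_by_group`, the `group` and `q1` fields, the `"mode"` and `"draw"` labels) is hypothetical. It assumes a single binary item coded 0/1 with `None` for missing, breaks a 50/50 tie in favour of 1, and leaves a value missing when its group has no observed answers.

```python
import random
from collections import defaultdict


def impute_by_group(rows, item, group_key, method="mode", rng=None):
    """Return a copy of rows with missing values of `item` imputed within groups."""
    rng = rng or random.Random(0)
    # Collect the observed (non-missing) answers for each group.
    observed = defaultdict(list)
    for r in rows:
        if r[item] is not None:
            observed[r[group_key]].append(r[item])

    out = []
    for r in rows:
        r = dict(r)  # copy so the input rows are not modified
        if r[item] is None:
            vals = observed.get(r[group_key])
            if vals:
                p = sum(vals) / len(vals)  # proportion of 1s in the group
                if method == "mode":
                    # Most frequent response; ties go to 1 (arbitrary choice).
                    r[item] = 1 if p >= 0.5 else 0
                else:
                    # Random draw: 1 with probability p, otherwise 0.
                    r[item] = 1 if rng.random() < p else 0
        out.append(r)
    return out


# Hypothetical toy data: two groups, one binary item.
rows = [
    {"group": "A", "q1": 1},
    {"group": "A", "q1": 1},
    {"group": "A", "q1": 0},
    {"group": "A", "q1": None},
    {"group": "B", "q1": 0},
    {"group": "B", "q1": None},
]

by_mode = impute_by_group(rows, "q1", "group", method="mode")
by_draw = impute_by_group(rows, "q1", "group", method="draw", rng=random.Random(42))
print([r["q1"] for r in by_mode])
print([r["q1"] for r in by_draw])
```

The most-frequent-response version pushes every imputed value toward the majority answer, so it shrinks the variability of the item; the random draw preserves the within-group proportions on average.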

 

But I would say, in very general terms, that the "questionnaire validation" is a failure at the first stage: the requirement to answer all the questions was not met.

 

 

liloit
Calcite | Level 5

Thanks for your suggestions.

Unfortunately, I had already checked and there is no pattern to the missing values; they are pretty much scattered across all the questions.

 

Regarding your second question, there are no different types of missing values; there is only one kind to take into consideration.

 

Today while browsing, I found the paper attached here, and I was thinking of applying a single stochastic logistic regression imputation using PROC MI.
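To make the idea concrete, here is a minimal sketch of what I understand by stochastic logistic regression imputation: fit a logistic model on the respondents who answered the item, then replace each missing answer with a random 0/1 draw from the predicted probability instead of the most likely value. It is written in Python only for illustration and does not reproduce PROC MI or the method in the paper. The function names, the single fully observed binary predictor `xs`, and the toy data are all assumptions.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(xs, ys, steps=2000, lr=0.1):
    """Fit y ~ b0 + b1*x by plain gradient ascent on the log-likelihood."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = y - sigmoid(b0 + b1 * x)
            g0 += err
            g1 += err * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1


def stochastic_impute(xs, ys, seed=0):
    """Replace each missing y (None) with a Bernoulli draw from the fitted model."""
    rng = random.Random(seed)
    observed = [(x, y) for x, y in zip(xs, ys) if y is not None]
    b0, b1 = fit_logistic([x for x, _ in observed], [y for _, y in observed])
    out = []
    for x, y in zip(xs, ys):
        if y is None:
            p = sigmoid(b0 + b1 * x)          # predicted probability of a 1
            y = 1 if rng.random() < p else 0  # random draw, not the modal value
        out.append(y)
    return out


# Hypothetical toy data: xs is a fully observed item, ys has two missing answers.
xs = [0, 0, 0, 0, 1, 1, 1, 1, 0, 1]
ys = [0, 0, 1, 0, 1, 1, 0, 1, None, None]
completed = stochastic_impute(xs, ys, seed=1)
print(completed)
```

With 62 items that all have missing values, each item would need to be imputed from the others in turn, which this one-predictor sketch does not attempt.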

