Programming the statistical procedures from SAS

Proc MI

N/A
Posts: 0

Hi,

I am trying to see if imputing the data using proc mi is a good option.

I have 8 numeric variables, of which 6 have less than 5% missing values.

Of the remaining 2 variables, one has almost 50% missing values and the other has 18%. Is it advisable to use PROC MI to impute values for a variable with about 50% missing values?

Thanks,

L
Trusted Advisor
Posts: 2,114

Re: Proc MI

Lisa,

I don't think that there is a threshold. You do need to consider "why" you have so many missing values in those two variables. If the reason does not fall into the MAR (missing at random) or MCAR (missing completely at random) categories, then you may be better off explicitly modeling "missingness" with an indicator variable.
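As a minimal sketch of the indicator idea (written in Python rather than SAS, purely for illustration; the function name, the 0.0 placeholder, and the sample responses are all made up here):

```python
def add_missing_indicator(values):
    """Return (filled, indicator) for a list where None marks a missing value.

    indicator[i] is 1 when values[i] was missing, otherwise 0.
    Missing entries get a constant placeholder (0.0); the indicator
    column records which rows were originally missing.
    """
    indicator = [1 if v is None else 0 for v in values]
    filled = [0.0 if v is None else float(v) for v in values]
    return filled, indicator


# Hypothetical survey item with three missing responses.
responses = [4, None, 5, 3, None, None, 2]
filled, was_missing = add_missing_indicator(responses)

# Share of missing values for this item.
missing_share = sum(was_missing) / len(was_missing)
```

Both columns would then go into the model together, so the indicator's coefficient absorbs the difference between respondents who answered and those who did not.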

Doc Muhlbaier
Duke
N/A
Posts: 0

Re: Proc MI

I am working on survey data, and values are missing for one of two reasons:

1. The person skipped that question
2. The question was not applicable to him.
Trusted Advisor
Posts: 2,114

Re: Proc MI

Personally, I would be concerned about whether either of them fits the PROC MI assumptions. Skipping can happen because the person didn't see the question or for some other extraneous reason (MCAR), but it can also happen because the person found the question intrusive (income, for instance, would not be appropriate for PROC MI). Not Applicable is definitely an answer that needs to be modeled; using PROC MI to impute another value is going to bias the results.
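One way to keep the two cases apart before any imputation is to record a status next to each value. The sketch below is in Python for illustration only, and the raw codes (an empty string for a skipped question, "NA" for not applicable) are assumptions, not the coding of any particular survey:

```python
def classify_response(raw):
    """Map a raw survey answer to (value, status).

    status is 'answered', 'skipped', or 'not_applicable'.
    The codes '' and 'NA' are hypothetical; adjust to the real survey coding.
    """
    if raw is None or raw.strip() == "":
        return None, "skipped"
    if raw.strip().upper() == "NA":
        return None, "not_applicable"
    return float(raw), "answered"


raw_answers = ["3", "", "NA", "5"]
classified = [classify_response(r) for r in raw_answers]

# Only skipped items are candidates for imputation; 'not_applicable'
# stays a category of its own to be modeled explicitly.
to_impute = [i for i, (_, status) in enumerate(classified) if status == "skipped"]
```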
N/A
Posts: 0

Re: Proc MI

I really don't want to lose any observations.

To avoid that, I am currently imputing by substituting the mean wherever a value is missing, the question was not applicable, or the customer had no experience with the item.

So is using PROC MI a better option than that?
Trusted Advisor
Posts: 2,114

Re: Proc MI

There is a fair amount of literature to indicate that mean substitution is one of the worst methods of imputation. PROC MI may well be better even with the assumptions violated.
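One reason is that filling in the mean adds observations with zero deviation, so the variance of the filled variable is understated. A small simulation sketch (Python, illustration only; the distribution, sample size, seed, and 50% MCAR deletion rate are arbitrary assumptions):

```python
import random
import statistics

random.seed(0)

# Simulated complete variable (hypothetical: normal, mean 50, sd 10).
complete = [random.gauss(50, 10) for _ in range(1000)]

# Delete roughly half of the values completely at random (MCAR).
observed = [x if random.random() > 0.5 else None for x in complete]
present = [x for x in observed if x is not None]

# Mean substitution: every missing value is replaced by the observed mean.
mean_obs = statistics.mean(present)
mean_filled = [mean_obs if x is None else x for x in observed]

var_observed = statistics.variance(present)    # observed cases only
var_filled = statistics.variance(mean_filled)  # after mean substitution

# The filled-in values contribute nothing to the sum of squares, so the
# variance shrinks by the factor (m - 1) / (n - 1), with m observed of n.
shrink_factor = (len(present) - 1) / (len(observed) - 1)
```

With about half the values missing, the variance after mean substitution comes out at roughly half the variance of the observed cases, which in turn understates standard errors and weakens correlations with other variables.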

You may want to consider a combination approach: explicitly model missingness for the two variables with the most missing values (that adds two indicator variables to the model) and use PROC MI for the other 6.

If you reach different conclusions with listwise deletion, mean substitution, and PROC MI, then you need to look further into the missingness mechanisms to understand the story your data are trying to tell.
N/A
Posts: 0

Re: Proc MI

I tried running PROC MI on all the variables, and I get results that are close to the results I get without any imputation.

With mean substitution, the results are not as close as with PROC MI.

How should I decide which one performs better?

Thanks,

L