Lukas19
Calcite | Level 5

Hi,

I have read a lot of documentation from sas.com and papers by other authors, but I can't find how to use the MINIMUM= option with the FCS statement of PROC MI.

More specifically, I have a data set of continuous and discrete variables, and I would like to use the FCS statement of PROC MI to replace the missing values. I want to use logistic regression for the classification variables and ordinary regression to impute the continuous variables. Because regression could lead to negative imputed values, I want to impose a minimum of 0. I have no idea where to put the MINIMUM= option in the syntax given in the SAS/STAT(R) 9.3 User's Guide.

I have tried several places to put the statement without success.

Could anyone help me out with this?

Kind regards

SteveDenham
Jade | Level 19

It appears that MINIMUM= is an option on the PROC MI statement itself rather than on the FCS statement, so you would have something like:

proc mi data=yourdata minimum=0;

This is per the SAS/STAT 12.3 (SAS 9.4) documentation.
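To make the placement concrete, here is a minimal sketch of a complete call. It assumes a hypothetical data set yourdata with two continuous variables x1 and x2 and one categorical variable group; those names, the seed, and the number of imputations are illustrative assumptions, not taken from the original question. MINIMUM= sits on the PROC MI statement, while the FCS statement only names the imputation method for each variable.

```sas
/* Sketch only: yourdata, x1, x2 and group are hypothetical names. */
proc mi data=yourdata out=imputed nimpute=5 seed=20240101
        minimum=0;        /* lower bound on imputed values: a PROC MI option, not an FCS option */
   class group;           /* categorical variable */
   fcs logistic(group)    /* logistic regression for the classification variable */
       reg(x1 x2);        /* regression method for the continuous variables */
   var x1 x2 group;
run;
```

As I read the documentation, MINIMUM= also accepts one number per variable in VAR statement order, with a missing value (.) meaning no bound, for example minimum=0 0 . for the VAR list above. It is worth checking that form against the documentation for your release, particularly when the VAR list includes CLASS variables.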

However, even given all of that, the regression method assumes that the data are Gaussian. This may not be the case for your data. An easy way to avoid values less than zero is to log-transform the continuous variables before imputation and then back-transform everything after imputation.

Steve Denham

Lukas19
Calcite | Level 5

Hi,

Thank you very much for your answer. I had considered the assumption that the data must be Gaussian.

That leaves me wondering which other multiple imputation method I could use, given my non-monotone missing-data pattern.

Do you know whether the assumption is a strict requirement, or whether violating it just lowers the power of subsequent analyses?

Kind regards

SteveDenham
Jade | Level 19

Well, let's look at the continuous variables--what are they?  How are they defined or measured?  Suppose the data were blood levels of some metabolite.  It would be a natural assumption that they have a lognormal distribution.  Thus, if I were imputing using PROC MI, I would transform all of the measurements by taking the log of the value.  Missing is still missing, but I could now use the regression method to impute the log of the missing values.  Does that make sense?
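The transform, impute, back-transform sequence described above could be sketched as follows. The names yourdata, x1, group, log_x1 and x1_imputed are hypothetical, and the sketch assumes x1 is strictly positive wherever it is observed.

```sas
/* Sketch only: yourdata, x1 and group are hypothetical names. */
data logged;
   set yourdata;
   /* Log only positive values; a missing x1 leaves log_x1 missing.
      Caution: an observed 0 would also leave log_x1 missing and would then
      be imputed, so zeros need separate handling before using this approach. */
   if x1 > 0 then log_x1 = log(x1);
run;

proc mi data=logged out=imputed nimpute=5 seed=20240101;
   class group;
   fcs logistic(group) reg(log_x1);   /* impute on the log scale */
   var log_x1 group;
run;

data imputed_back;
   set imputed;
   x1_imputed = exp(log_x1);   /* back-transform: always positive */
run;
```

Writing the result to a new variable (x1_imputed) rather than overwriting x1 keeps the original observed values available for comparison.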

So I guess it comes down to what those continuous variables are.

Another entirely different approach is to not impute at all, but use maximum likelihood methods for your estimations.  Provided the data are at least MAR, these estimates will be asymptotically unbiased.
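One way to get maximum likelihood estimates without creating imputed data sets, staying inside PROC MI, is the EM statement with NIMPUTE=0. This is only an illustration of the idea and not necessarily the approach meant above: it assumes multivariate normality, covers only the continuous variables, and uses the hypothetical names yourdata and x1 to x3.

```sas
/* Sketch only: yourdata and x1-x3 are hypothetical names. */
proc mi data=yourdata nimpute=0;   /* nimpute=0: no imputed data sets are created */
   em outem=em_estimates;          /* EM-based ML estimates of the means and covariances */
   var x1 x2 x3;
run;
```

For a regression-type analysis, a procedure that fits the model of interest directly by likelihood on the incomplete data would be the more usual route.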

Steve Denham

ohcomeon
Fluorite | Level 6

The cure can be worse than the disease here. Setting a minimum can introduce biases (and slow runtime). Log-transformation can distort the relationships between variables, and there is no guarantee that the logged variables will be any closer to normal than the original variables. Often the best approach is just to hold your nose and impute non-normal variables as though they were normal. It's not optimal -- and hopefully better options will become available through PROC MI -- but often the results are not too bad.

 

I have written extensively about this issue in this paper:

 https://pdfs.semanticscholar.org/74b4/a31619809c99866760109e00c34ba8830728.pdf
