bollibompa
Quartz | Level 8

Hi,

 

I am imputing missing data using PROC MI with the FCS statement.

My dataset contains 46 variables and 10 of them (9 continuous and 1 binary) have missing data, which I am trying to impute using FCS.

I am using the logistic, discrim and regression methods, and I specify min and max values for each imputed variable.

 

The procedure works fine for 9 out of 10 variables, but when I add the last variable to impute I receive the error message "An imputed variable is not in the specified range after 100 tries". I have checked the specified min/max values for that variable and they are OK.

However, if I increase/decrease the max-value the process works. I think PROC MI doesn't want to impute values within my range because it somehow thinks it is too narrow.

Does anyone have experience with this?

 

Thanks

Thomas

 

7 REPLIES
Rick_SAS
SAS Super FREQ
Could you please post the code that doesn't work?
SteveDenham
Jade | Level 19

Also, please tell us something about the variables that you are trying to impute, and in particular about the ranges you want the values to fall into.

 

Steve Denham

bollibompa
Quartz | Level 8

Thanks for the reply!

 

Here is the PROC MI code:


proc mi data=have out=want nimpute=1 seed=2015
min=. . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . .
37 90 45 75 153 3.3 134 0 0.12 0.05 .

max=. . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . .;

class co_var1 co_var2 co_var5-co_var10 co_var15-co_var32 co_var36-co_var40 co_var51;
transform log (co_var49 co_var44-co_var48 co_var50); /*transforming skewed variables*/
fcs discrim (co_var38-co_var39 co_var51 co_var37/classeffects=include) nbiter=100; /*binary variables*/
fcs logistic (co_var40/order=formatted);
fcs reg (co_var41-co_var50); /*continuous variables I want to impute (mainly laboratory information)*/

var co_var1-co_var51;
run;

 

The code above now works, but only because I put a zero in place of the min-value 53 in the MIN= option (the lone 0 in the MIN= list, originally marked in bold). When I put in 53 instead (which is what I want) I receive the error message:

 

"ERROR: An imputed variable value is not in the specified range after 100 tries."

 

Thanks again for the help!

/Thomas

SteveDenham
Jade | Level 19

Hmm, so 0 works and 53 doesn't.  When you look at the values obtained through imputation for this variable (and I'm not sure which one it is), what sort of imputed values do you get?
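
One way to check, as a rough sketch: flag the rows that were missing before imputation, rerun the step that works (the one with 0 in MIN=) on the flagged data, and compare observed with imputed values. The names have_flag and was_missing are made up for illustration, and co_var48 is only my guess at the variable, based on where the 0 sits in the MIN= list.

```sas
/* Flag rows where the variable was missing before imputation.
   co_var48 is an assumed name; replace it with the actual variable. */
data have_flag;
  set have;
  was_missing = missing(co_var48);
run;

/* Rerun the working PROC MI step with DATA=have_flag OUT=want.
   Variables outside the VAR list, such as was_missing, are carried into OUT=. */

/* Compare observed (was_missing=0) with imputed (was_missing=1) values. */
proc means data=want n min p5 median p95 max;
  class was_missing;
  var co_var48;
run;
```

If the imputed group sits well below the observed group, that would point to the model rather than to the bounds.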

 

I get the feeling that there is something like quasi-separation going on, so that all the predicted values come out lower than expected.
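
If only some of the draws land below the limit, allowing more redraws might be enough. As far as I know, the "100 tries" in the message is the default of the MINMAXITER= option on the PROC MI statement. Below is the posted step again as a sketch, assuming the 0 in MIN= is where 53 belongs; the value 1000 is arbitrary and only for illustration.

```sas
/* Same step as posted, with 53 restored in MIN= and more redraws allowed.
   MINMAXITER=1000 is an arbitrary illustrative value (the default is 100). */
proc mi data=have out=want nimpute=1 seed=2015 minmaxiter=1000
min=. . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . .
37 90 45 75 153 3.3 134 53 0.12 0.05 .
max=. . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . .;
class co_var1 co_var2 co_var5-co_var10 co_var15-co_var32 co_var36-co_var40 co_var51;
transform log (co_var49 co_var44-co_var48 co_var50);
fcs discrim (co_var38-co_var39 co_var51 co_var37/classeffects=include) nbiter=100;
fcs logistic (co_var40/order=formatted);
fcs reg (co_var41-co_var50);
var co_var1-co_var51;
run;
```

If nearly every draw falls below the limit, a larger MINMAXITER= will not help and will only cost run time.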

 

Steve Denham

bollibompa
Quartz | Level 8

Thanks,

This variable is laboratory information. It seems that SAS thinks my min-max interval is too narrow, because if I set the minimum to 0 it imputes values down to 20 (so it thinks that 53 is too high). However, 53 is the lowest observed value for that variable.

 

I also have another continuous variable with the same issue, but here PROC MI instead imputes very high values. The range for this variable is 0-6, but PROC MI imputes values up to 35 (without any min or max set).

 

 

/Thomas

SteveDenham
Jade | Level 19

Well, if you have reason to believe that 53 is the absolutely lowest value that could be observed for that variable, then you could set all values less than 53 to 53.
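
A minimal sketch of that, assuming WANT is the OUT= data set from a run without the lower bound and that co_var48 is the variable in question (my guess from the MIN= list); want_trunc is a made-up name:

```sas
/* Raise imputed values below the floor of 53 up to 53.
   co_var48 and want_trunc are assumed names for illustration. */
data want_trunc;
  set want;
  /* SAS treats a missing value as smaller than any number,
     so guard against missing before comparing with 53. */
  if not missing(co_var48) and co_var48 < 53 then co_var48 = 53;
run;
```

Since 53 is the lowest observed value, only imputed values are changed by this step.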

 

However, predicting the value from the others in the study seems to indicate that the missing values are associated with predictors that, taken together, result in a predicted value that is less than your limit. Now comes the art of multiple imputation--are those imputed values reasonable in a sampling universe represented by your complete cases? If so, then the limit you have set is arbitrarily too high. If not, then truncate the distribution at the cutoff.

 

And it becomes a mixture problem...

 

Steve Denham

bollibompa
Quartz | Level 8

Thanks for your input!

There is something strange with this variable, so I need to think of another way to handle it.

/Thomas
