jrushing_wfu
Calcite | Level 5


I am trying to use multiple imputation in PROC MI to impute hospital diagnoses. According to the documentation, it looks like I want to use the FCS method, since diagnosis is a multi-level categorical variable.

When I run my code, I get a floating point error. I am assuming the issue is that this variable has many levels. I tried lumping categories together to reduce the number of levels from about 235 to 40, but the proc still bombed.

Has anyone here used PROC MI with the FCS statement to impute a categorical variable? I can provide the code I'm using if so. Maybe my syntax is wrong.

Also, when I run the proc, the log notes that the FCS statement is experimental in my release of SAS (9.3).

Thanks

Julia

4 REPLIES
SteveDenham
Jade | Level 19

I don't know if this will be of any help for your particular situation, but check out this thread: https://communities.sas.com/message/145964#145964

With a floating point error, I would guess you have correctly identified the number of levels as the main source of the problem. From there, it may be a sparsity of data/quasi-separation problem: there is insufficient data to get the categorical model (assuming it is logistic) to converge, and so the likelihood function wanders off into places in the manifold where it overflows. However, diagnosing the problem and solving it are two different tasks, and we'll need the first to attack the second. Check your syntax against what is in the referenced thread and the suggestion there, and then look at some cross-tabs showing where the missing levels fall relative to the predictive variables.

Steve Denham

jrushing_wfu
Calcite | Level 5

Thanks Steve.  I will check this out.

Julia

jrushing_wfu
Calcite | Level 5

Steve, do you think the following syntax looks right?  I only want to impute for the variable DXCCS1, and want to use the other variables to help inform that process.  DXCCS1 right now has about 40 levels.  The dataset is huge--2.4 million records.

 

proc mi data=all out=outall nimpute=10;
  class dxgrp ccsproc female discharge_loc;
  fcs nbiter=20 discrim(dxgrp);
  var dxgrp ccsproc newlos female discharge_loc age; /* newlos and age continuous */
run;

Even with the large N, it could still be a sparsity issue, I guess, with some levels of CCSPROC (primary procedure, which has 200 or so levels) having very little data.
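One way to check the sparsity concern is to tabulate the class variables before running PROC MI. The following is only a sketch: it reuses the data set and variable names from the code above, while the output data set name ccsproc_counts and the cutoff of 50 records are arbitrary choices for illustration, not recommended values.

```sas
/* Count records per CCSPROC level, treating missing as its own level,
   and save the counts so that thin levels can be listed. */
proc freq data=all order=freq;
  tables ccsproc / missing out=ccsproc_counts;
  /* Cross-tab of the variable to be imputed against one predictor,
     with missing values shown as a category. */
  tables dxgrp*discharge_loc / missing norow nocol nopercent;
run;

/* List the sparse levels; the threshold of 50 is illustrative only. */
proc print data=ccsproc_counts(where=(count < 50));
run;
```

Levels that turn up with only a handful of records are candidates for collapsing before imputation.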

Below are the messages I get from SAS when running that code:

 

NOTE: The FCS statement is experimental in this release.
WARNING: The covariates are not specified in an FCS discriminant method for variable dxgrp, only remaining continuous variables will be used as covariates.
WARNING: The covariates are not specified in an FCS discriminant method for variable ccsproc, only remaining continuous variables will be used as covariates.
WARNING: The covariates are not specified in an FCS discriminant method for variable FEMALE, only remaining continuous variables will be used as covariates.
WARNING: The covariates are not specified in an FCS discriminant method for variable discharge_loc, only remaining continuous variables will be used as covariates.
WARNING: An effect for variable newlos is a linear combination of other effects. The coefficient of the effect will be set to zero in the imputation.
ERROR: Floating Point Zero Divide.
ERROR: Termination due to Floating Point Exception

Thanks for any feedback.

Julia

SteveDenham
Jade | Level 19

Still not sure, but try the following:

proc mi data=all out=outall nimpute=10 seed=123;
  class dxccs1 dxgrp ccsproc female discharge_loc;
  fcs nbiter=20 discrim(dxccs1 = dxgrp ccsproc newlos female discharge_loc age / classeffects=include details);
  var dxccs1 dxgrp ccsproc newlos female discharge_loc age; /* newlos and age continuous */
run;

If that still blows things up, you might have some luck using logistic rather than discrim as the method in the fcs statement.  For an example that works, see Example 57.8 FCS Method with Trace Plot.
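For reference, a logistic version of that call might look like the sketch below. It is untested against this data and carries an assumption: LINK=GLOGIT (generalized logit, suited to a nominal response) is documented for the LOGISTIC option in current SAS/STAT releases, but it may not be available in 9.3. Without it, the default cumulative logit treats the levels of dxccs1 as ordered, which is probably not appropriate for diagnosis codes.

```sas
proc mi data=all out=outall nimpute=10 seed=123;
  class dxccs1 dxgrp ccsproc female discharge_loc;
  /* LINK=GLOGIT is an assumption about the release in use;
     drop it if PROC MI rejects the option. */
  fcs nbiter=20 logistic(dxccs1 = dxgrp ccsproc newlos female discharge_loc age / link=glogit details);
  var dxccs1 dxgrp ccsproc newlos female discharge_loc age; /* newlos and age continuous */
run;
```

With about 40 response levels and a 200-level covariate, a generalized logit model has a very large number of parameters, so collapsing ccsproc further or leaving it out of the covariate list may be needed for the model to converge.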

Steve Denham
