Crystal_F
Quartz | Level 8

Hello everyone,

I am working on a survey dataset with missing values in two categorical variables: f_base_meanImputedIncomeCat and f_base_CollegeandAbove (15%). In the multiple imputation step, I ran the following code:

proc mi data=final.model1and2_clean seed=875 nimpute=5 out=temp.outfcs ;
class race_f gender_f f_base_married f_base_insurance  f_base_meanImputedIncomeCat f_base_CollegeandAbove; 
fcs logistic ( f_base_meanImputedIncomeCat f_base_CollegeandAbove ) ;
var age_skIPindex race_f gender_f f_base_married f_base_insurance  f_base_meanImputedIncomeCat f_base_CollegeandAbove ;
run;

And I got these warning messages:

WARNING: The covariates are not specified in an FCS discriminant method for variable race_f, only remaining continuous
         variables will be used as covariates with the default CLASSEFFECTS=EXCLUDE option.
WARNING: The covariates are not specified in an FCS discriminant method for variable gender_f, only remaining continuous
         variables will be used as covariates with the default CLASSEFFECTS=EXCLUDE option.
WARNING: The covariates are not specified in an FCS discriminant method for variable f_base_Married, only remaining continuous
         variables will be used as covariates with the default CLASSEFFECTS=EXCLUDE option.
WARNING: The covariates are not specified in an FCS discriminant method for variable f_base_Insurance, only remaining
         continuous variables will be used as covariates with the default CLASSEFFECTS=EXCLUDE option.
WARNING: The maximum likelihood estimates for the logistic regression with observed observations may not exist for variable
         f_base_CollegeandAbove. The posterior predictive distribution of the parameters used in the imputation process is
         based on the maximum likelihood estimates in the last maximum likelihood iteration.

Since I don't need to impute the other categorical variables, I assumed this was fine and moved on to the next step: using the imputed data set to fit the Cox model to the survey data.

proc format;
value college
0='Less than college'
1='College and above'
;
run;

proc format;
value Insurance
0='Just Medicare'
1='Medicaid'
2='Any other supplemental insurance'
;
run;

proc format;
value meanIncome
1 ="Less than $12,102"
2 ="$12,102-$21,000"
3 ="$21,001-$34,409"
4 ="$34,410-$60,000"
5 ="$60,001+"
;
run;

proc format;
value race
1="White"
2="Black"
;
run;

proc format;
value gender
1="Male"
2="Female"
;
run;

proc surveyphreg data = temp.outfcs;
format race_f race.;
format gender_f gender.;
format f_base_insurance insurance.;
format f_base_meanImputedIncomeCat meanIncome.;
format f_base_CollegeandAbove college.;
weight Int_baseweight;
strata varstrat;
cluster varunit;
class race_f(ref='White') gender_f (ref='Male')
      flag_rehospNONsk(ref='0') flag_rehospSk(ref='0') ICH (ref='0')
      type_fstrehab(ref='Non')
      f_base_CollegeandAbove(ref='Less than college') f_base_married(ref='0')
      f_base_insurance(ref='Just Medicare') f_base_meanImputedIncomeCat (ref='$60,001+')
      HEbeforeSKIPindex (ref='0') / param=ref;
model lenfol_IP*censor_death_f(0) = race_f gender_f age_skIPindex
      flag_rehospNONsk flag_rehospSk ICH
      complicationIndex_IP ProcedureIndex_IP LOS_InitialIP
      type_fstrehab LOS_fstrehab
      complicationIndex_fstrehab
      transition_number
      sum_therapy_mins_90day
      comorbidityIndex
      f_base_CollegeandAbove f_base_married f_base_insurance f_base_meanImputedIncomeCat f_base_networkcount HEbeforeSKIPindex
      / risklimit ;
ods output parameterEstimates = gentemp.surveyphregest;
by _imputation_;
run;

It seemed fine at this point:

NOTE: The BY statement provides completely separate analyses of the BY groups.  It does not provide a statistically valid
      subpopulation or domain analysis, where the total number of units in the subpopulation is not known with certainty. If
      you want a domain analysis, you should include the DOMAIN variables in a DOMAIN statement.
NOTE: Convergence criterion (GCONV=1E-8) satisfied.
NOTE: At least one element of the gradient is greater than 1e-3.
WARNING: Stratum level 3 contains only a single cluster. Single-cluster strata are not included in the variance estimates.
WARNING: Stratum level 8 contains only a single cluster. Single-cluster strata are not included in the variance estimates.
WARNING: Stratum level 11 contains only a single cluster. Single-cluster strata are not included in the variance estimates.
WARNING: Stratum level 26 contains only a single cluster. Single-cluster strata are not included in the variance estimates.
WARNING: Stratum level 30 contains only a single cluster. Single-cluster strata are not included in the variance estimates.
WARNING: Stratum level 43 contains only a single cluster. Single-cluster strata are not included in the variance estimates.
WARNING: Stratum level 44 contains only a single cluster. Single-cluster strata are not included in the variance estimates.
WARNING: Stratum level 46 contains only a single cluster. Single-cluster strata are not included in the variance estimates.
WARNING: Stratum level 48 contains only a single cluster. Single-cluster strata are not included in the variance estimates.
WARNING: Stratum level 50 contains only a single cluster. Single-cluster strata are not included in the variance estimates.
NOTE: The above message was for the following BY group:
      Imputation Number=1
NOTE: Convergence criterion (GCONV=1E-8) satisfied.
NOTE: At least one element of the gradient is greater than 1e-3.

But I got an error message after running the following code:

proc mianalyze parms (CLASSVAR=CLASSVAL)=gentemp.surveyphregest edf=42;
	class race_f  gender_f 
      	  flag_rehospNONsk flag_rehospSk ICH
          type_fstrehab
          f_base_CollegeandAbove  f_base_married 
          f_base_insurance f_base_meanImputedIncomeCat 
          HEbeforeSKIPindex ;
	modeleffects        race_f gender_f age_skIPindex 
                        flag_rehospNONsk flag_rehospSk ICH 
                        complicationIndex_IP ProcedureIndex_IP LOS_InitialIP 
						type_fstrehab LOS_fstrehab 
                        complicationIndex_fstrehab 
						transition_number
						sum_therapy_mins_90day
						comorbidityIndex
                        f_base_CollegeandAbove  f_base_married  f_base_insurance f_base_meanImputedIncomeCat f_base_networkcount HEbeforeSKIPindex 
						;
run;
ERROR: Variable ClassVal0 is not in the PARMS= data set.
NOTE: The SAS System stopped processing this step because of errors.

I don't know which part caused the error. My guesses are:

1. A variable name such as sum_therapy_mins_90day is too long and was truncated?

2. CLASSVAR=CLASSVAL does not apply to the SURVEYPHREG procedure. But I also tried the FULL and LEVEL options, and neither worked in my case.

Any suggestions would be greatly appreciated!

Thank you for your help!

4 REPLIES
SAS_Rob
SAS Employee

The issue, as you might have guessed, is that SURVEYPHREG does not create a data set compatible with any of the CLASSVAR= options, so the ParameterEstimates data set needs some modification. Specifically, for PROC MIANALYZE to use this table, the PARAMETER variable must be modified so that its value in each row is a valid SAS name.

The following DATA step uses the SCAN function to pick out the level (the second word) in the PARAMETER variable.

data parms_class;
set gentemp.surveyphregest;

grp = scan(parameter,2,' ');
if grp='' then effect=parameter;
else effect = 'grp';
run;


proc mianalyze parms(classvar=full)=parms_class edf=42;...
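
Before reshaping the table, it may help to look at what SURVEYPHREG actually wrote. The sketch below assumes only the data set name gentemp.surveyphregest from the original post; the OBS=15 limit is an arbitrary choice. It lists the variables in the table and prints the first rows, so the "variable level" text stored in PARAMETER can be checked by eye.

```sas
/* List the variables that the ODS OUTPUT statement saved */
proc contents data=gentemp.surveyphregest varnum;
run;

/* Print the first rows; class effects should show "variable level" in Parameter */
proc print data=gentemp.surveyphregest(obs=15);
run;
```

If the PARAMETER values do not follow the "variable level" pattern, the SCAN-based step above would need to be adjusted accordingly.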

Crystal_F
Quartz | Level 8

Hi Rob,

Thank you for your help. I changed the code as suggested:

data gentemp.parms_class; 
set gentemp.surveyphregest;
grp = scan(parameter,2,' ');
if grp='' then effect=parameter;
else effect = 'grp';
run;

proc mianalyze parms (CLASSVAR=full)=gentemp.parms_class edf=42;
	class race_f  gender_f 
      	  flag_rehospNONsk flag_rehospSk ICH
          type_fstrehab
          f_base_CollegeandAbove  f_base_married 
          f_base_insurance f_base_meanImputedIncomeCat 
          HEbeforeSKIPindex ;
	modeleffects        race_f gender_f age_skIPindex 
                        /*var from the initial ip clm */
                        flag_rehospNONsk flag_rehospSk ICH 
                        complicationIndex_IP ProcedureIndex_IP LOS_InitialIP 
						/*var from the first rehab*/
						type_fstrehab LOS_fstrehab 
                        complicationIndex_fstrehab 
						/*var from the transition dataset*/
						transition_number
						/*var from the therapy estimates, only 90-day agg is needed*/
						sum_therapy_mins_90day
                        /*var from the beneficiary cc file*/
						comorbidityIndex
						/*nhats baseline vars*/
                        f_base_CollegeandAbove  f_base_married  f_base_insurance f_base_meanImputedIncomeCat f_base_networkcount HEbeforeSKIPindex 
						;
run;

There is an error message:

ERROR: Variable race_f is not in the PARMS= data set.

In the gentemp.parms_class data set, the value of the variable grp is "Black" and the value of the variable effect is 'grp'. Is there any solution to this new problem? Thank you!

SAS_Rob
SAS Employee

Sorry for the confusion. I think the variable name was wrong in my code.

Try this instead:

data gentemp.parms_class; 
set gentemp.surveyphregest;
race_f = scan(parameter,2,' ');
if race_f='' then effect=parameter;
else effect = 'race_f';
run;

proc mianalyze parms (CLASSVAR=full)=gentemp.parms_class edf=42;
	class race_f;
modeleffects race_f;
run;


If this works then you will need to do something similar for each of the CLASS variables.
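
One way to do "something similar" for every CLASS variable in a single pass is sketched below. It is an untested sketch under several assumptions: that each class-level row of PARAMETER reads "variable level" (as the "Black" value reported above suggests), that continuous effects appear as the bare variable name, and that the table does not already contain variables named Effect or _lvl. The output name work.parms_full, the helper _lvl, and the $64 width are hypothetical choices, not anything SURVEYPHREG produces.

```sas
/* Sketch only: build the CLASSVAR=FULL layout for all CLASS variables at once */
data work.parms_full;                        /* hypothetical output data set */
   set gentemp.surveyphregest;
   length Effect $32 _lvl $64;               /* _lvl is a hypothetical helper */

   /* One new character variable per CLASS variable; $64 is an assumed width */
   array cls {*} $ 64 race_f gender_f flag_rehospNONsk flag_rehospSk ICH
                      type_fstrehab f_base_CollegeandAbove f_base_married
                      f_base_insurance f_base_meanImputedIncomeCat
                      HEbeforeSKIPindex;

   /* First word of Parameter is the effect name */
   Effect = scan(Parameter, 1, ' ');

   /* Everything after the first word is the level; it may contain blanks */
   if countw(Parameter, ' ') > 1 then
      _lvl = left(substr(left(Parameter), length(Effect) + 1));

   /* Store the level in the class variable whose name matches the effect */
   do _i = 1 to dim(cls);
      if upcase(vname(cls{_i})) = upcase(Effect) then do;
         Effect  = vname(cls{_i});           /* same spelling as the CLASS list */
         cls{_i} = _lvl;
      end;
   end;

   /* Drop Parameter so that Effect is the only effect-name column */
   drop _i _lvl Parameter;
run;
```

Rows for continuous effects keep blank class variables and carry only their name in Effect. The PROC MIANALYZE step with the full CLASS and MODELEFFECTS lists could then be pointed at parms(classvar=full)=work.parms_full; whether it accepts this layout for every effect would need to be confirmed on the actual output.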

Crystal_F
Quartz | Level 8

Hi Rob,

Thanks again for the extra effort to clarify. But I'm lost on how to build the right data structure for SAS to read in the imputation estimates. I tried your code and it did give me a result without an error, but I'm not sure I got this right: every parameter ended up being called 'race_f'. Which variable is crucial for PROC MIANALYZE to generate the result? Am I supposed to recreate each variable and keep its categorical value from the Parameter column? And for the continuous variables ... Sorry, I'm confused.
