Problem with PROC MIANALYZE after running PROC SURVEYPHREG

Posted 10-30-2018 10:42 AM

Hello everyone,

I am working on a survey dataset with missing values in two categorical variables: f_base_meanImputedIncomeCat and f_base_CollegeandAbove (15%). In the multiple imputation step, I ran the following code:

```
proc mi data=final.model1and2_clean seed=875 nimpute=5 out=temp.outfcs ;
class race_f gender_f f_base_married f_base_insurance f_base_meanImputedIncomeCat f_base_CollegeandAbove;
fcs logistic ( f_base_meanImputedIncomeCat f_base_CollegeandAbove ) ;
var age_skIPindex race_f gender_f f_base_married f_base_insurance f_base_meanImputedIncomeCat f_base_CollegeandAbove ;
run;
```

And I got the following warning messages:

```
WARNING: The covariates are not specified in an FCS discriminant method for variable race_f, only remaining continuous
variables will be used as covariates with the default CLASSEFFECTS=EXCLUDE option.
WARNING: The covariates are not specified in an FCS discriminant method for variable gender_f, only remaining continuous
variables will be used as covariates with the default CLASSEFFECTS=EXCLUDE option.
WARNING: The covariates are not specified in an FCS discriminant method for variable f_base_Married, only remaining continuous
variables will be used as covariates with the default CLASSEFFECTS=EXCLUDE option.
WARNING: The covariates are not specified in an FCS discriminant method for variable f_base_Insurance, only remaining
continuous variables will be used as covariates with the default CLASSEFFECTS=EXCLUDE option.
WARNING: The maximum likelihood estimates for the logistic regression with observed observations may not exist for variable
f_base_CollegeandAbove. The posterior predictive distribution of the parameters used in the imputation process is
based on the maximum likelihood estimates in the last maximum likelihood iteration.
```

Since I don't need to impute the other categorical variables, I assumed this would be fine and moved on to the next step: using the imputed data set to fit a Cox model on the survey data.

```
proc format;
  value college
    0='Less than college'
    1='College and above'
  ;
run;
proc format;
  value Insurance
    0='Just Medicare'
    1='Medicaid'
    2='Any other supplemental insurance'
  ;
run;
proc format;
  value meanIncome
    1="Less than $12,102"
    2="$12,102-$21,000"
    3="$21,001-$34,409"
    4="$34,410-$60,000"
    5="$60,001+"
  ;
run;
proc format;
  value race
    1="White"
    2="Black"
  ;
run;
proc format;
  value gender
    1="Male"
    2="Female"
  ;
run;
proc surveyphreg data = temp.outfcs;
  format race_f race.;
  format gender_f gender.;
  format f_base_insurance insurance.;
  format f_base_meanImputedIncomeCat meanIncome.;
  format f_base_CollegeandAbove college.;
  weight Int_baseweight;
  strata varstrat;
  cluster varunit;
  class race_f(ref='White') gender_f (ref='Male')
        flag_rehospNONsk(ref='0') flag_rehospSk(ref='0') ICH (ref='0')
        type_fstrehab(ref='Non')
        f_base_CollegeandAbove(ref='Less than college') f_base_married(ref='0')
        f_base_insurance(ref='Just Medicare') f_base_meanImputedIncomeCat (ref='$60,001+')
        HEbeforeSKIPindex (ref='0') / param=ref;
  model lenfol_IP*censor_death_f(0) = race_f gender_f age_skIPindex
        flag_rehospNONsk flag_rehospSk ICH
        complicationIndex_IP ProcedureIndex_IP LOS_InitialIP
        type_fstrehab LOS_fstrehab
        complicationIndex_fstrehab
        transition_number
        sum_therapy_mins_90day
        comorbidityIndex
        f_base_CollegeandAbove f_base_married f_base_insurance f_base_meanImputedIncomeCat f_base_networkcount HEbeforeSKIPindex / risklimit
  ;
  ods output parameterEstimates = gentemp.surveyphregest;
  by _imputation_;
run;
```

It seemed fine at this point:

```
NOTE: The BY statement provides completely separate analyses of the BY groups. It does not provide a statistically valid
subpopulation or domain analysis, where the total number of units in the subpopulation is not known with certainty. If
you want a domain analysis, you should include the DOMAIN variables in a DOMAIN statement.
NOTE: Convergence criterion (GCONV=1E-8) satisfied.
NOTE: At least one element of the gradient is greater than 1e-3.
WARNING: Stratum level 3 contains only a single cluster. Single-cluster strata are not included in the variance estimates.
WARNING: Stratum level 8 contains only a single cluster. Single-cluster strata are not included in the variance estimates.
WARNING: Stratum level 11 contains only a single cluster. Single-cluster strata are not included in the variance estimates.
WARNING: Stratum level 26 contains only a single cluster. Single-cluster strata are not included in the variance estimates.
WARNING: Stratum level 30 contains only a single cluster. Single-cluster strata are not included in the variance estimates.
WARNING: Stratum level 43 contains only a single cluster. Single-cluster strata are not included in the variance estimates.
WARNING: Stratum level 44 contains only a single cluster. Single-cluster strata are not included in the variance estimates.
WARNING: Stratum level 46 contains only a single cluster. Single-cluster strata are not included in the variance estimates.
WARNING: Stratum level 48 contains only a single cluster. Single-cluster strata are not included in the variance estimates.
WARNING: Stratum level 50 contains only a single cluster. Single-cluster strata are not included in the variance estimates.
NOTE: The above message was for the following BY group:
Imputation Number=1
NOTE: Convergence criterion (GCONV=1E-8) satisfied.
NOTE: At least one element of the gradient is greater than 1e-3.
```

But I got an error message after running the following code:

```
proc mianalyze parms (CLASSVAR=CLASSVAL)=gentemp.surveyphregest edf=42;
class race_f gender_f
flag_rehospNONsk flag_rehospSk ICH
type_fstrehab
f_base_CollegeandAbove f_base_married
f_base_insurance f_base_meanImputedIncomeCat
HEbeforeSKIPindex ;
modeleffects race_f gender_f age_skIPindex
flag_rehospNONsk flag_rehospSk ICH
complicationIndex_IP ProcedureIndex_IP LOS_InitialIP
type_fstrehab LOS_fstrehab
complicationIndex_fstrehab
transition_number
sum_therapy_mins_90day
comorbidityIndex
f_base_CollegeandAbove f_base_married f_base_insurance f_base_meanImputedIncomeCat f_base_networkcount HEbeforeSKIPindex
;
run;
```

```
ERROR: Variable ClassVal0 is not in the PARMS= data set.
NOTE: The SAS System stopped processing this step because of errors.
```

I don't know which part caused the error. My guesses are:

1. Is a variable name too long and being truncated, like sum_therapy_mins_90day?

2. CLASSVAR=CLASSVAL is not the right option for output from the SURVEYPHREG procedure. But I also tried CLASSVAR=FULL and CLASSVAR=LEVEL, and neither worked in my case.

Any suggestions would be greatly appreciated!

Thank you for your help!

4 REPLIES


The issue, as you might have guessed, is that SURVEYPHREG does not really create a data set compatible with any of the CLASSVAR= options. It takes some modification of the ParameterEstimates data set. Specifically, in order for PROC MIANALYZE to use this table it is necessary to modify the PARAMETER variable such that its value in each row is a valid SAS name.
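Before modifying anything, it may help to look at what the ODS output table actually contains. The following is only a minimal sketch that uses the data set name from the question; it lists the columns (to confirm, for example, that there is no ClassVal0 variable) and prints the rows for the first imputation so the values of PARAMETER can be read directly.

```sas
/* List the variables in the ODS output table in creation order */
proc contents data=gentemp.surveyphregest varnum;
run;

/* Print the parameter rows for the first imputation only */
proc print data=gentemp.surveyphregest;
  where _imputation_ = 1;
run;
```

If PARAMETER holds values such as "race_f Black" (variable name, a blank, then the formatted level), those rows are the ones that need reshaping before PROC MIANALYZE can use them.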

The following DATA step uses the SCAN function to pick out the first level (second word) in the PARAMETER variable.

```
data parms_class;
  set gentemp.surveyphregest;
  grp = scan(parameter,2,' ');
  if grp='' then effect=parameter;
  else effect = 'grp';
run;

proc mianalyze parms(classvar=full)=parms_class edf=42;...
```
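To make the SCAN step concrete, here is a small stand-alone illustration. The two strings are made-up examples of what PARAMETER might hold (one class effect with a level, one continuous effect); they are assumptions, not values copied from the real table.

```sas
/* Illustration only: the two test strings are hypothetical PARAMETER values */
data _null_;
  length parameter $40 grp $40;
  do parameter = 'race_f Black', 'age_skIPindex';
    grp = scan(parameter, 2, ' ');   /* second blank-delimited word, if any */
    put parameter= grp=;             /* writes both values to the log */
  end;
run;
```

For the first string grp is Black; for the second there is no second word, so grp is blank, which is what the IF test in the DATA step relies on to tell continuous effects from class effects. Note that SCAN returns only one word, so a level made of several words (such as 'Any other supplemental insurance') would be cut to its first word.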


Hi Rob,

Thank you for your help. I changed the code as suggested:

```
data gentemp.parms_class;
set gentemp.surveyphregest;
grp = scan(parameter,2,' ');
if grp='' then effect=parameter;
else effect = 'grp';
run;
proc mianalyze parms (CLASSVAR=full)=gentemp.parms_class edf=42;
class race_f gender_f
flag_rehospNONsk flag_rehospSk ICH
type_fstrehab
f_base_CollegeandAbove f_base_married
f_base_insurance f_base_meanImputedIncomeCat
HEbeforeSKIPindex ;
modeleffects race_f gender_f age_skIPindex
/*var from the initial ip clm */
flag_rehospNONsk flag_rehospSk ICH
complicationIndex_IP ProcedureIndex_IP LOS_InitialIP
/*var from the first rehab*/
type_fstrehab LOS_fstrehab
complicationIndex_fstrehab
/*var from the transition dataset*/
transition_number
/*var from the therapy estimates, only 90-day agg is needed*/
sum_therapy_mins_90day
/*var from the beneficiary cc file*/
comorbidityIndex
/*nhats baseline vars*/
f_base_CollegeandAbove f_base_married f_base_insurance f_base_meanImputedIncomeCat f_base_networkcount HEbeforeSKIPindex
;
run;
```

There is an error message:

`ERROR: Variable race_f is not in the PARMS= data set.`

In the gentemp.parms_class data set, the variable grp has the value "Black" and the variable effect has the value 'grp'. Is there a solution to this new problem? Thank you!


Sorry for the confusion. I think the variable name was wrong in my code.

Try this instead:

```
data gentemp.parms_class;
  set gentemp.surveyphregest;
  race_f = scan(parameter,2,' ');
  if race_f='' then effect=parameter;
  else effect = 'race_f';
run;
proc mianalyze parms (CLASSVAR=full)=gentemp.parms_class edf=42;
  class race_f;
  modeleffects race_f;
run;
```

If this works then you will need to do something similar for each of the CLASS variables.
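One way "something similar for each of the CLASS variables" might look is sketched below. It is untested and rests on an assumption: that every class-effect row of PARAMETER has the form "variable name, blank, formatted level". The first word becomes EFFECT, and the rest of the string (which may contain blanks) is copied into a column named after that class variable. The helper variable level and the column lengths are choices made for this illustration, and only three class variables are shown.

```sas
/* Untested sketch: assumes PARAMETER looks like "<variable> <formatted level>" */
data gentemp.parms_class;
  set gentemp.surveyphregest;
  length effect $32 level race_f gender_f f_base_insurance $64;

  parameter = left(parameter);
  effect = scan(parameter, 1, ' ');        /* first word = effect name */

  /* Everything after the first word is the level; it may contain blanks */
  if countw(parameter, ' ') > 1 then
    level = left(substr(parameter, length(effect) + 1));

  /* Copy the level into the column that matches the effect name */
  select (upcase(effect));
    when ('RACE_F')           race_f = level;
    when ('GENDER_F')         gender_f = level;
    when ('F_BASE_INSURANCE') f_base_insurance = level;
    otherwise;                /* continuous effects: class columns stay blank */
  end;
  drop level;
run;

proc mianalyze parms(classvar=full)=gentemp.parms_class edf=42;
  class race_f gender_f f_base_insurance;
  modeleffects race_f gender_f age_skIPindex f_base_insurance;
run;
```

With this layout, continuous effects keep their own name in EFFECT and have blank class columns, while each class effect keeps its own name in EFFECT instead of all rows being labeled 'race_f'. The remaining class variables would need their own WHEN branch and column, and it is worth checking the combined output against the per-imputation estimates before trusting it.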


Hi Rob,

Thanks again for taking the extra effort to clarify. But I'm lost on how to build the right data structure for SAS to read in the imputation estimates. I tried your code and it did produce results without an error, but I'm not sure I got this right: every parameter ended up being called 'race_f'. Which variable is crucial for PROC MIANALYZE to generate its results? Am I supposed to recreate each class variable and fill it with its categorical value from the Parameter column? And what about the continuous variables? Sorry, I'm confused.
