Hi everyone,
I've been stuck on a problem for a while and hope someone can help me out. Any suggestions would be highly appreciated!
I'm trying to fit a Cox model to a survey dataset with missing values on two categorical variables, income and education. The code ran fine until the final step, when I tried to get the combined parameter estimates from PROC MIANALYZE. I received this error message:
ERROR: Variable black_model is not in the PARMS= data set.
The Cox model has both continuous and categorical variables, and I used only a selected set of variables to impute income and education.
proc mi data=final.model1and2_clean seed=875 nimpute=5 out=temp.outfcs ;
class black_model male_model married_model f_base_insurance f_base_meanImputedIncomeCat f_base_CollegeandAbove;
fcs logistic ( f_base_meanImputedIncomeCat f_base_CollegeandAbove ) ;
var age_skIPindex race_f gender_f f_base_married f_base_insurance f_base_meanImputedIncomeCat f_base_CollegeandAbove ;
run;
data temp.outfcs_1;
set temp.outfcs;
black_model=(race_f=1);
male_model=(gender_f=2);
married_model=(f_base_married=0);
insurance_medicaid=f_base_insurance=1;
insurance_other=f_base_insurance=2;
insurance_justmedicare=f_base_insurance=0;
income1=f_base_meanImputedIncomeCat=1;
income2=f_base_meanImputedIncomeCat=2;
income3=f_base_meanImputedIncomeCat=3;
income4=f_base_meanImputedIncomeCat=4;
income5=f_base_meanImputedIncomeCat=5;
college_model=f_base_CollegeandAbove=0;
HEbeforeSKIPindex_model=HEbeforeSKIPindex=0;
rehospNONsk_model=flag_rehospNONsk;
rehospSk_model=flag_rehospSk=0;
ICH_model=ich=0;
Nofstrehab=type_fstrehab='Non';
irffstrehab=type_fstrehab='irf';
snffstrehab=type_fstrehab='snf';
run;
proc surveyphreg data = temp.outfcs_1;
weight Int_baseweight;
strata varstrat;
cluster varunit;
class black_model male_model
rehospNONsk_model rehospSk_model ICH_model
irffstrehab snffstrehab
college_model
income1 income2 income3 income4 married_model
insurance_medicaid insurance_other
HEbeforeSKIPindex_model;
model lenfol_IP*censor_death_f(0) = black_model male_model age_skIPindex
rehospNONsk_model rehospSk_model ICH_model
complicationIndex_IP ProcedureIndex_IP LOS_InitialIP
irffstrehab snffstrehab
complicationIndex_fstrehab
transition_number
sum_therapy_mins_90day
comorbidityIndex
college_model income1 income2 income3 income4 married_model
insurance_medicaid insurance_other f_base_networkcount HEbeforeSKIPindex_model / risklimit
;
ods output parameterEstimates = gentemp.surveyphregest4;
by _imputation_;
run;
proc mianalyze parms =gentemp.surveyphregest4 edf=42;
class black_model male_model
rehospNONsk_model rehospSk_model ICH_model
irffstrehab snffstrehab
college_model
income1 income2 income3 income4 married_model
insurance_medicaid insurance_other
HEbeforeSKIPindex_model ;
modeleffects black_model male_model age_skIPindex
rehospNONsk_model rehospSk_model ICH_model
complicationIndex_IP ProcedureIndex_IP LOS_InitialIP
irffstrehab snffstrehab
complicationIndex_fstrehab
transition_number
sum_therapy_mins_90day
comorbidityIndex
college_model income1 income2 income3 income4 married_model
insurance_medicaid insurance_other f_base_networkcount HEbeforeSKIPindex_model
;
run;
When there are CLASS variables in the model, the ODS table ParameterEstimates from PROC SURVEYPHREG does not match any of the CLASSVAR= formats that PROC MIANALYZE accepts. However, with minimal DATA step coding, the data set can be converted to a format that PROC MIANALYZE can readily use.
Try this example:
data mortality;
input ID VARSTRATA VARPSU SWEIGHT AGE VITALSTATUS POvARIND GENDER;
x=rannor(123);
datalines;
1 03 1 13312 66 1 1 1
2 03 1 7941 71 3 1 2
3 03 1 16048 . 4 1 1
4 03 3 9298 58 3 1 1
5 03 2 15336 56 3 1 2
6 03 1 14744 63 1 1 1
7 03 2 83729 70 1 2 2
8 03 3 106492 57 1 2 1
9 03 3 78083 81 3 2 2
10 03 3 55957 79 3 2 1
11 03 3 83729 68 1 2 2
12 03 1 78083 78 3 2 2
13 03 2 13824 78 1 1 2
14 03 3 13824 70 3 1 2
15 03 3 44649 50 1 1 2
16 03 1 9298 . 6 1 1
17 03 1 13824 77 1 1 2
18 03 3 4767 82 3 1 1
19 03 3 15336 56 3 1 2
20 03 3 16048 68 3 1 1
21 03 1 9298 74 1 1 1
22 03 2 14744 . 6 1 1
23 03 2 4767 77 3 1 1
24 03 2 16048 65 3 1 1
25 03 1 106492 61 1 2 1
26 03 3 170748 . 1 2 2
27 03 2 9298 . 1 1 1
28 03 1 78083 89 1 2 2
29 03 1 170748 58 1 2 2
30 03 2 20029 64 1 1 2
31 03 3 20029 63 1 1 2
32 03 1 32595 38 3 1 1
33 03 1 83729 70 1 2 2
34 03 2 110606 67 3 2 1
35 03 2 96469 77 1 2 2
36 03 3 55957 90 1 2 1
37 03 1 96469 69 3 2 2
38 03 2 106492 59 1 2 1
39 03 2 34328 50 1 1 2
40 03 2 13826 61 3 1 1
41 03 2 10466 72 1 1 2
42 03 2 21344 57 1 1 2
43 03 3 12059 78 1 1 2
44 03 2 9298 68 1 1 1
45 03 3 96469 71 3 2 2
46 08 3 5825 64 3 1 1
47 08 3 6656 58 3 1 1
48 08 3 2570 86 3 1 2
49 08 1 7282 49 1 1 2
50 08 2 6280 52 1 1 1
51 08 1 5825 66 1 1 1
52 08 2 7282 56 1 1 2
53 08 1 58254 51 1 2 2
54 08 2 68404 78 3 2 1
55 08 3 53246 64 1 2 1
56 08 1 7972 64 3 1 2
57 08 2 7282 50 1 1 2
58 08 1 50242 55 3 2 1
59 08 3 29377 81 3 2 1
60 08 3 29377 72 3 2 1
61 08 1 7106 75 1 1 2
62 08 1 2570 91 1 1 2
63 08 2 7972 83 1 1 2
64 08 1 8551 71 3 1 1
65 08 2 10413 . 1 1 2
66 08 3 46598 73 1 2 1
67 08 2 20558 78 3 2 2
68 08 1 20558 85 3 2 2
69 08 1 46598 67 1 2 1
70 08 3 83303 72 1 2 2
71 08 2 50242 47 1 2 1
72 08 3 68404 75 3 2 1
73 08 3 20558 88 3 2 2
74 08 1 63777 78 1 2 2
75 08 2 16725 75 3 2 2
76 08 2 70470 . 5 2 2
77 08 1 29377 80 3 2 1
78 08 1 53246 53 3 2 1
79 08 2 29377 78 3 2 1
80 08 3 70470 60 1 2 2
81 08 3 20558 89 1 2 2
82 08 1 56851 68 1 2 2
83 08 1 75098 65 1 2 2
84 08 1 68404 78 1 2 1
85 08 2 50242 52 1 2 1
86 08 1 63777 78 1 2 2
87 08 3 56851 71 1 2 2
88 11 1 113592 63 1 2 1
89 11 1 47843 86 1 2 1
90 11 1 113592 64 1 2 1
91 11 1 90096 87 3 2 2
92 11 1 99238 70 1 2 2
93 11 1 105885 85 1 2 2
94 11 1 77295 54 1 2 1
95 11 1 110393 . 4 2 2
96 11 1 110393 . 5 2 2
97 11 1 105237 66 3 2 1
98 11 2 47843 82 5 2 1
99 11 2 86027 65 6 2 1
100 11 2 12405 69 4 1 2
;
proc mi data=mortality out=mort_imp seed=123 nimpute=5;
class povarind gender vitalstatus;
var povarind gender vitalstatus x age;
monotone regression;
run;
proc surveyphreg data = mort_imp;
by _imputation_;
class povarind gender;
strata varstrata;
cluster varpsu;
weight sweight;
model age*vitalstatus(1 4 5 6) = povarind gender x;
hazardratio povarind;
hazardratio gender;
ods output ParameterEstimates=parms;
run;
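Before reshaping the output, it can help to look at what ODS actually wrote. The sketch below assumes only the parms data set created by the step above. It lists the column names and prints the first few rows, so the form of the PARAMETER values for the CLASS effects is visible.

```sas
/* List the variables in the ODS output data set */
proc contents data=parms;
run;

/* Print the first few rows to see how PARAMETER is labeled */
proc print data=parms(obs=10);
run;
```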
The resulting data set does not match any of the available CLASSVAR= formats, but the variable PARAMETER contains all the information needed to reformat it to match CLASSVAR=FULL. For PROC MIANALYZE to use this table, the PARAMETER variable must be modified so that its value in each row is a valid SAS name. The following DATA step uses the SCAN function to pick out the level (the second word) in the PARAMETER variable. Since there are multiple CLASS variables, it is necessary to check both the parameter names and the levels. To avoid any issues with case sensitivity, the UPCASE function is used. The last ELSE statement in the series handles the continuous variable. You need one IF/THEN DO block for each CLASS variable.
data parms_class;
set parms;
if upcase(scan(parameter,1))='POVARIND' then do;
effect='POVARIND';
povarind=scan(parameter,2,' ');
end;
else if upcase(scan(parameter,1))='GENDER' then do;
effect='GENDER';
gender=scan(parameter,2,' ');
end;
else effect=parameter;
run;
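To see what the SCAN and UPCASE calls in this step return, a small standalone check can be run. The PARAMETER value here is a made-up example of the "name level" form, not copied from real output.

```sas
/* Standalone check: split a hypothetical PARAMETER value into name and level */
data _null_;
  parameter = 'povarind 1';              /* hypothetical value for illustration */
  name  = upcase(scan(parameter, 1));    /* first word, uppercased */
  level = scan(parameter, 2, ' ');       /* second word, split on blanks */
  put name= level=;                      /* log shows: name=POVARIND level=1 */
run;
```

Note that SCAN returns a character value, so the level variables created this way (povarind, gender) are character variables in the reformatted data set.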
Finally, PROC MIANALYZE is run using the CLASSVAR=FULL suboption.
proc mianalyze parms(classvar=full)=parms_class;
class povarind gender;
modeleffects povarind gender x;
run;
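If you also want hazard ratios (HR), an additional step is needed, because PROC SURVEYPHREG does not report standard errors for the hazard ratios. Save the combined estimates from PROC MIANALYZE, then exponentiate each estimate and its confidence limits: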
proc mianalyze parms(classvar=full)=parms_class;
class povarind gender;
modeleffects povarind gender x;
ods output ParameterEstimates=comb_parms;
run;
data comb_parms;
set comb_parms;
hr=exp(estimate);
lower_hr=exp(lclmean);
upper_hr=exp(uclmean);
run;
proc print data=comb_parms;
title 'Hazard Ratio Estimates';
var Parm Povarind Gender HR lower_hr upper_hr;
run;