I have my model parameters from PROC LOGISTIC pooled using PROC MIANALYZE. Now I want to use that model to (1) get the ROC curve for the training set, and (2) make predictions on a validation set. How would I go about doing this? My model is stored as parameter estimates.
proc logistic data=outfiles.daysurg07_16_miout; /*logistic regression*/
class AGE_18to30 AGE_31to40 AGE_51to60 AGE_41to50 AGE_61to70 AGE_71to80 AGE_81plus GENDER_male BMI_obese BMI_overweight BMI_normal BMI_underweight ASACLASS_above3 FNSTATUS_NotIndependent PMHxSMOKING PMHxALCOHOL PMHxCAD PMHxCHF PMHxDIABETES PMHxHTN PMHxSTEROIDS PMHxCOPD PMHxDYSPNEA PMHxTIACVA PMHxNEUROIMP PMHxBLEEDINGDISORDER PMHxRENALFAIL PMHxPVD PMHxCANCER PREGNANT SURG_TYPE;
model COMPLICATIONS (descending)= AGE_18to30 AGE_31to40 AGE_51to60 AGE_41to50 AGE_61to70 AGE_71to80 AGE_81plus GENDER_male BMI_obese BMI_overweight BMI_normal BMI_underweight ASACLASS_above3 FNSTATUS_NotIndependent PMHxSMOKING PMHxALCOHOL PMHxCAD PMHxCHF PMHxDIABETES PMHxHTN PMHxSTEROIDS PMHxCOPD PMHxDYSPNEA PMHxTIACVA PMHxNEUROIMP PMHxBLEEDINGDISORDER PMHxRENALFAIL PMHxPVD PMHxCANCER PREGNANT SURG_TYPE / selection=stepwise slentry=0.10 slstay=0.15 details lackfit;
by _imputation_;
ods output ParameterEstimates=outfiles.lgsparms OddsRatio=outfiles.lgsodds;
proc mianalyze parms(classvar=classval)=outfiles.lgsparms; /*pooling coefficients*/
class AGE_18to30 AGE_31to40 AGE_51to60 AGE_41to50 AGE_61to70 AGE_71to80 AGE_81plus GENDER_male BMI_obese BMI_overweight BMI_normal BMI_underweight ASACLASS_above3 FNSTATUS_NotIndependent PMHxSMOKING PMHxALCOHOL PMHxCAD PMHxCHF PMHxDIABETES PMHxHTN PMHxSTEROIDS PMHxCOPD PMHxDYSPNEA PMHxTIACVA PMHxNEUROIMP PMHxBLEEDINGDISORDER PMHxRENALFAIL PMHxPVD PMHxCANCER PREGNANT SURG_TYPE;
modeleffects AGE_18to30 AGE_31to40 AGE_51to60 AGE_41to50 AGE_61to70 AGE_71to80 AGE_81plus GENDER_male BMI_obese BMI_overweight BMI_normal BMI_underweight ASACLASS_above3 FNSTATUS_NotIndependent PMHxSMOKING PMHxALCOHOL PMHxCAD PMHxCHF PMHxDIABETES PMHxHTN PMHxSTEROIDS PMHxCOPD PMHxDYSPNEA PMHxTIACVA PMHxNEUROIMP PMHxBLEEDINGDISORDER PMHxRENALFAIL PMHxPVD PMHxCANCER PREGNANT SURG_TYPE;
ods output parameterestimates=outfiles.mianalyze_parms_f CovB=outfiles.mianalyze_covb_f;
run;
You can do this by arranging the output data set of parameter estimates from PROC MIANALYZE for use as an INEST= data set for PROC LOGISTIC. You can then run PROC LOGISTIC on your training data set to get the ROC curve and predictions for your validation data. Use the INEST= option to bring in the pooled parameter estimates and the MAXITER=0 option to prevent PROC LOGISTIC from changing them. Use the PLOTS=ROC option to request the training data ROC curve and the SCORE statement to score the validation data.
The following uses the fish example from the MIANALYZE documentation. The original data is used as both the training and validation data, but you would use your validation data set in the last LOGISTIC run.
data Fish2;
input Species $ Length Width @@;
datalines;
Parkki 16.5 2.3265 Parkki 17.4 2.3142 . 19.8 .
Parkki 21.3 2.9181 Parkki 22.4 3.2928 . 23.2 3.2944
Parkki 23.2 3.4104 Parkki 24.1 3.1571 . 25.8 3.6636
Parkki 28.0 4.1440 Parkki 29.0 4.2340 Perch 8.8 1.4080
. 14.7 1.9992 Perch 16.0 2.4320 Perch 17.2 2.6316
Perch 18.5 2.9415 Perch 19.2 3.3216 . 19.4 .
Perch 20.2 3.0502 Perch 20.8 3.0368 Perch 21.0 2.7720
Perch 22.5 3.5550 Perch 22.5 3.3075 . 22.5 .
Perch 22.8 3.5340 . 23.5 . Perch 23.5 3.5250
Perch 23.5 3.5250 Perch 23.5 3.5250 Perch 23.5 3.9950
. 24.0 . Perch 24.0 3.6240 Perch 24.2 3.6300
Perch 24.5 3.6260 Perch 25.0 3.7250 . 25.5 3.7230
Perch 25.5 3.8250 Perch 26.2 4.1658 Perch 26.5 3.6835
. 27.0 4.2390 Perch 28.0 4.1440 Perch 28.7 5.1373
. 28.9 4.3350 . 28.9 . . 28.9 4.5662
Perch 29.4 4.2042 Perch 30.1 4.6354 Perch 31.6 4.7716
Perch 34.0 6.0180 . 36.5 6.3875 . 37.3 7.7957
. 39.0 . . 38.3 . Perch 39.4 6.2646
Perch 39.3 6.3666 Perch 41.4 7.4934 Perch 41.4 6.0030
Perch 41.3 7.3514 . 42.3 . Perch 42.5 7.2250
Perch 42.4 7.4624 Perch 42.5 6.6300 Perch 44.6 6.8684
Perch 45.2 7.2772 Perch 45.5 7.4165 Perch 46.0 8.1420
Perch 46.6 7.5958
;
proc mi data=Fish2 seed=1305417 out=outfish2;
class Species;
monotone logistic( Species= Length Width);
var Length Width Species;
run;
ods select none;
proc logistic data=outfish2;
class Species;
model Species= Length Width / covb;
by _Imputation_;
ods output ParameterEstimates=lgsparms;
run;
ods select all;
proc mianalyze parms=lgsparms;
modeleffects Intercept Length Width;
ods output parameterestimates=pe;
run;
proc transpose data=pe out=tpe;
var estimate; id parm;
run;
data train; set fish2; run;
data valid; set fish2; run;
proc logistic data=train inest=tpe plots(only)=roc;
model Species= Length Width / maxiter=0;
score data=valid out=valpred2;
run;
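If you want to confirm that the scored probabilities really come from the pooled estimates, one option is to apply the coefficients by hand in a DATA step and compare. This is only a sketch for the fish example: the data set name valcheck and the variables b0, b_len, b_wid, xbeta, and p_hand are made up for illustration, and it assumes tpe holds a single row with columns Intercept, Length, and Width, as produced by the PROC TRANSPOSE step above.

```sas
/* Cross-check sketch: apply the pooled coefficients by hand.          */
/* Assumes tpe has one row with columns Intercept, Length, and Width.  */
/* The coefficients are renamed so they do not collide with the        */
/* predictors of the same name in valid.                               */
data valcheck;
   if _n_ = 1 then set tpe(keep=Intercept Length Width
                           rename=(Intercept=b0 Length=b_len Width=b_wid));
   set valid;
   xbeta  = b0 + b_len*Length + b_wid*Width;  /* linear predictor        */
   p_hand = logistic(xbeta);                  /* 1 / (1 + exp(-xbeta))   */
run;
```

With the default response ordering, p_hand should be the probability of the first ordered level of Species (Parkki here), so it can be compared with the corresponding predicted probability variable in valpred2. Rows with a missing Width get a missing p_hand.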