michellemabelle
Calcite | Level 5

I have pooled my model parameters from PROC LOGISTIC using PROC MIANALYZE. Now I want to use that model to (1) get the ROC curve for the training set and (2) make predictions on a validation set. How would I go about doing this? My model is stored as parameter estimates.

 

proc logistic data=outfiles.daysurg07_16_miout; /*logistic regression*/
	class AGE_18to30 AGE_31to40 AGE_51to60 AGE_41to50 AGE_61to70 AGE_71to80 AGE_81plus GENDER_male BMI_obese BMI_overweight BMI_normal BMI_underweight ASACLASS_above3 FNSTATUS_NotIndependent PMHxSMOKING PMHxALCOHOL PMHxCAD PMHxCHF PMHxDIABETES PMHxHTN PMHxSTEROIDS PMHxCOPD PMHxDYSPNEA PMHxTIACVA PMHxNEUROIMP PMHxBLEEDINGDISORDER PMHxRENALFAIL PMHxPVD PMHxCANCER PREGNANT SURG_TYPE;
	model COMPLICATIONS (descending)= AGE_18to30 AGE_31to40 AGE_51to60 AGE_41to50 AGE_61to70 AGE_71to80 AGE_81plus GENDER_male BMI_obese BMI_overweight BMI_normal BMI_underweight ASACLASS_above3 FNSTATUS_NotIndependent PMHxSMOKING PMHxALCOHOL PMHxCAD PMHxCHF PMHxDIABETES PMHxHTN PMHxSTEROIDS PMHxCOPD PMHxDYSPNEA PMHxTIACVA PMHxNEUROIMP PMHxBLEEDINGDISORDER PMHxRENALFAIL PMHxPVD PMHxCANCER PREGNANT SURG_TYPE / selection=stepwise slentry=0.10 slstay=0.15 details lackfit;
	by _imputation_;
	ods output ParameterEstimates=outfiles.lgsparms OddsRatio=outfiles.lgsodds;
run;

proc mianalyze parms(classvar=classval)=outfiles.lgsparms; /*pooling coefficients*/
	class AGE_18to30 AGE_31to40 AGE_51to60 AGE_41to50 AGE_61to70 AGE_71to80 AGE_81plus GENDER_male BMI_obese BMI_overweight BMI_normal BMI_underweight ASACLASS_above3 FNSTATUS_NotIndependent PMHxSMOKING PMHxALCOHOL PMHxCAD PMHxCHF PMHxDIABETES PMHxHTN PMHxSTEROIDS PMHxCOPD PMHxDYSPNEA PMHxTIACVA PMHxNEUROIMP PMHxBLEEDINGDISORDER PMHxRENALFAIL PMHxPVD PMHxCANCER PREGNANT SURG_TYPE;
	modeleffects AGE_18to30 AGE_31to40 AGE_51to60 AGE_41to50 AGE_61to70 AGE_71to80 AGE_81plus GENDER_male BMI_obese BMI_overweight BMI_normal BMI_underweight ASACLASS_above3 FNSTATUS_NotIndependent PMHxSMOKING PMHxALCOHOL PMHxCAD PMHxCHF PMHxDIABETES PMHxHTN PMHxSTEROIDS PMHxCOPD PMHxDYSPNEA PMHxTIACVA PMHxNEUROIMP PMHxBLEEDINGDISORDER PMHxRENALFAIL PMHxPVD PMHxCANCER PREGNANT SURG_TYPE;
	ods output parameterestimates=outfiles.mianalyze_parms_f CovB=outfiles.mianalyze_covb_f;
run;
StatDave
SAS Super FREQ

You can do this by rearranging the PROC MIANALYZE output data set of parameter estimates so that it can be used as an INEST= data set for PROC LOGISTIC. Then run PROC LOGISTIC on your training data set to get the ROC curve and the predictions for your validation data. Use the INEST= option to bring in the pooled parameter estimates and the MAXITER=0 option in the MODEL statement to prevent PROC LOGISTIC from changing them. Use the PLOTS=ROC option to request the training data ROC curve and the SCORE statement to score the validation data.

 

The following uses the fish example from the MIANALYZE documentation. The original data is used as both the training and validation data, but you would use your validation data set in the last LOGISTIC run.

 

data Fish2;
   input Species $ Length Width @@;
   datalines;
Parkki  16.5  2.3265    Parkki  17.4  2.3142    .      19.8   .
Parkki  21.3  2.9181    Parkki  22.4  3.2928    .      23.2  3.2944
Parkki  23.2  3.4104    Parkki  24.1  3.1571    .      25.8  3.6636
Parkki  28.0  4.1440    Parkki  29.0  4.2340    Perch   8.8  1.4080
.       14.7  1.9992    Perch   16.0  2.4320    Perch  17.2  2.6316
Perch   18.5  2.9415    Perch   19.2  3.3216    .      19.4   .
Perch   20.2  3.0502    Perch   20.8  3.0368    Perch  21.0  2.7720
Perch   22.5  3.5550    Perch   22.5  3.3075    .      22.5   .
Perch   22.8  3.5340    .       23.5   .        Perch  23.5  3.5250
Perch   23.5  3.5250    Perch   23.5  3.5250    Perch  23.5  3.9950
.       24.0   .        Perch   24.0  3.6240    Perch  24.2  3.6300
Perch   24.5  3.6260    Perch   25.0  3.7250    .      25.5  3.7230
Perch   25.5  3.8250    Perch   26.2  4.1658    Perch  26.5  3.6835
.       27.0  4.2390    Perch   28.0  4.1440    Perch  28.7  5.1373
.       28.9  4.3350    .       28.9   .        .      28.9  4.5662
Perch   29.4  4.2042    Perch   30.1  4.6354    Perch  31.6  4.7716
Perch   34.0  6.0180    .       36.5  6.3875    .      37.3  7.7957
.       39.0   .        .       38.3   .        Perch  39.4  6.2646
Perch   39.3  6.3666    Perch   41.4  7.4934    Perch  41.4  6.0030
Perch   41.3  7.3514    .       42.3   .        Perch  42.5  7.2250
Perch   42.4  7.4624    Perch   42.5  6.6300    Perch  44.6  6.8684
Perch   45.2  7.2772    Perch   45.5  7.4165    Perch  46.0  8.1420
Perch   46.6  7.5958
;
proc mi data=Fish2 seed=1305417 out=outfish2;
   class Species;
   monotone logistic( Species= Length Width);
   var Length Width Species;
run;
ods select none;
proc logistic data=outfish2;
   class Species;
   model Species= Length Width / covb;
   by _Imputation_;
   ods output ParameterEstimates=lgsparms;
   run;
ods select all;
proc mianalyze parms=lgsparms;
   modeleffects Intercept Length Width;
   ods output parameterestimates=pe;
   run;

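/* Reshape the pooled estimates into one row with columns Intercept, Length, Width */
proc transpose data=pe out=tpe; 
   var estimate; id parm; 
   run;
/* For illustration only, the original data serve as both training and validation data */
data train; set fish2; run;
data valid; set fish2; run;
/* INEST= supplies the pooled estimates as initial values; MAXITER=0 keeps them unchanged */
proc logistic data=train inest=tpe plots(only)=roc;
   model Species= Length Width / maxiter=0;
   score data=valid out=valpred2;
   run;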
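
If you also want a ROC curve for the validation data, a sketch along the following lines may help. It assumes the train, valid, and tpe data sets from the code above already exist and that the validation data contain the observed response. The data set name valroc is made up for illustration. The OUTROC= and FITSTAT options of the SCORE statement save the ROC coordinates for the scored data and print fit statistics for it, which in recent SAS/STAT releases should include the AUC.

```sas
/* Sketch only: assumes train, valid, and tpe were created as shown above. */
/* valroc is a hypothetical name for the validation ROC coordinates.       */
proc logistic data=train inest=tpe plots(only)=roc;
   model Species = Length Width / maxiter=0;
   /* FITSTAT prints fit statistics for the scored (validation) data; */
   /* OUTROC= saves the ROC curve coordinates for that data.          */
   score data=valid out=valpred2 outroc=valroc fitstat;
run;

/* Plot sensitivity against 1 - specificity for the validation data. */
proc sgplot data=valroc;
   series x=_1mspec_ y=_sensit_;
   lineparm x=0 y=0 slope=1 / lineattrs=(pattern=dash);  /* chance line */
   xaxis label="1 - Specificity";
   yaxis label="Sensitivity";
run;
```

Observations in the validation data with a missing response are still scored, but they cannot contribute to the fit statistics or the ROC coordinates.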

 
