A Macro For Getting More Out Of Your ROC Curve: Cut off for best sensitivity Specificity trade

Regular Contributor
Posts: 164
Hi everyone,

I really hope someone will take the time to correct this macro. I have been trying to figure out what the problem is, but in vain.

This macro calculates the optimal cut-off for the trade-off between sensitivity and specificity. The original paper from 2008 is available here:

 

http://www2.sas.com/proceedings/forum2008/231-2008.pdf

 

This is a great macro that is useful to many people and has been used in several published papers. Now it seems to be broken. I have been trying to fix it without any luck, but I am not a power programmer like many others here, so I am hoping someone can solve it for all of us who need this macro.

Here is the code. I think the error comes after the line with the DATA _NULL_ step.

 

%MACRO SNSP_TRADEOFF(
/**********************************************************************
MACRO NAME  : SNSP_TRADEOFF
PURPOSE     : CREATES A GRAPH WITH MULTIPLE ROC MEASURES
SAS VERSION : VERSION 8.2, 9.1
PARAMETERS  :
-----------------------------------------------------------------------
NAME         TYPE         DEFAULT   DESCRIPTION AND VALID VALUES
-----------  -----------  --------  -------------------------------------
DATASET      REQUIRED               : SOURCE DATASET
OUTCOME      REQUIRED               : RESPONSE VARIABLE (CATEGORICAL)
OUTCOME_LEV  REQUIRED               : EVENT OF INTEREST. A LEVEL OF THE OUTCOME
                                      PARAMETER THAT DEFINES THE EVENT
                                      (E.G., OUTCOME_LEV=1 OR OUTCOME_LEV=RELAPSE)
XVAR         REQUIRED               : CONTINUOUS PREDICTOR
FONT         OPTIONAL     ARIAL     : FONT TYPE OF GRAPHICAL OUTPUT
XVAR_LABEL   OPTIONAL               : LABEL OF PREDICTOR IN GRAPHICAL OUTPUT
AUC          OPTIONAL     Y         : Y/N. DISPLAY AREA UNDER CURVE
SN           OPTIONAL     Y         : Y/N. DISPLAY SENSITIVITY
SP           OPTIONAL     Y         : Y/N. DISPLAY SPECIFICITY
TA           OPTIONAL               : Y/N. DISPLAY TOTAL ACCURACY
YI           OPTIONAL               : Y/N. DISPLAY YOUDEN INDEX
NPV          OPTIONAL               : Y/N. DISPLAY NEGATIVE PREDICTIVE VALUE
PPV          OPTIONAL               : Y/N. DISPLAY POSITIVE PREDICTIVE VALUE
MCC          OPTIONAL               : Y/N. DISPLAY MCC
TME          OPTIONAL               : Y/N. DISPLAY TOTAL MISCLASSIFICATION
                                      ERROR USING INVERSE OF TARGET GROUP
                                      AS LOSS
TME1         OPTIONAL               : Y/N. DISPLAY TOTAL MISCLASSIFICATION
                                      ERROR USING INVERSE OF OVERALL
                                      SAMPLE SIZE AS LOSS
TME2         OPTIONAL               : Y/N. DISPLAY TOTAL MISCLASSIFICATION
                                      ERROR USING RATIO OF LOSSES
WME          OPTIONAL               : Y/N. DISPLAY WEIGHTED MISCLASSIFICATION
                                      ERROR USING INVERSE OF TARGET GROUP
                                      AS LOSS
WME1         OPTIONAL               : Y/N. DISPLAY WEIGHTED MISCLASSIFICATION
                                      ERROR USING INVERSE OF OVERALL
                                      SAMPLE SIZE AS LOSS
WME2         OPTIONAL               : Y/N. DISPLAY WEIGHTED MISCLASSIFICATION
                                      ERROR USING RATIO OF LOSSES
GAMMA        REQUIRED IF            : Y/N. POSITIVE NUMERIC VALUE OF RATIO OF
             TME2 OR                  LFN/LFP (I.E., LOSS DUE TO
             WME2=Y                   MISCLASSIFICATION OF A TRUE
                                      POSITIVE EVENT IS "GAMMA" TIMES
                                      LARGER THAN MISCLASSIFICATION
                                      OF A TRUE NEGATIVE EVENT)
-------------------------------------------------------------------------
NOTES :THE FOLLOWING DATASETS ARE CREATED IN THE WORK LIBRARY AND DELETED
BY THE MACRO PRIOR TO ALL CALCULATIONS.
_SNSP_EST, _SNSP_ROC1, _SNSP_AUC, _SNSP_OUTPUT.
**************************************************************************/
DATASET = , OUTCOME = , OUTCOME_LEV = , XVAR = , FONT = ARIAL,
XVAR_LABEL = , AUC = Y, SN = Y, SP = Y, TA = , YI = , NPV = ,
PPV = , MCC = , TME = , TME1 = , TME2 = , WME = , WME1 = ,
WME2 = , GAMMA = );
%PUT MACRO SNSP_TRADEOFF IS NOW EXECUTING...;
%*****************************************************************;
%* CHECKING THAT REQUIRED PARAMETERS ARE SPECIFIED *;
%*****************************************************************;
%IF %SYSFUNC(EXIST(&DATASET)) = 0 %THEN %DO;
%PUT ERROR: DATA SET &DATASET IS MISSING;
%GOTO EXIT;
%END;
%IF %LENGTH(&DATASET) = 0 %THEN %DO;
%PUT ERROR: VALUE FOR PARAMETER DATASET IS MISSING;
%GOTO EXIT;
%END;
%IF %LENGTH(&OUTCOME) = 0 %THEN %DO;
%PUT ERROR: VALUE FOR PARAMETER OUTCOME IS MISSING;
%GOTO EXIT;
%END;
%IF %LENGTH(&OUTCOME_LEV) = 0 %THEN %DO;
%PUT ERROR: VALUE FOR PARAMETER OUTCOME_LEV IS MISSING;
%GOTO EXIT;
%END;
%IF %LENGTH(&XVAR) = 0 %THEN %DO;
%PUT ERROR: VALUE FOR PARAMETER XVAR IS MISSING;
%GOTO EXIT;
%END;
%IF &TME2 = Y OR &WME2 = Y %THEN %DO;
%IF %SYSEVALF(&GAMMA <= 0) %THEN %DO;
%PUT ERROR: VALUE FOR GAMMA IS MISSING OR INCORRECT;
%GOTO EXIT;
%END;
%END;
%LOCAL COUNT;
%LOCAL CURDATA;
%LET DATASETS = _SNSP_EST _SNSP_ROC1 _SNSP_AUC _SNSP_OUTPUT;
%IF %LENGTH(&DATASETS) > 0 %THEN %DO;
%LET COUNT=1;
%LET CURDATA =%SCAN(&DATASETS,&COUNT,' ');
%DO %WHILE(&CURDATA NE);
%IF %SYSFUNC(EXIST(&CURDATA)) %THEN %DO;
PROC DATASETS NOLIST;
DELETE &CURDATA;
RUN; QUIT;
%END;
%LET COUNT=%EVAL(&COUNT+1);
%LET CURDATA =%SCAN(&DATASETS,&COUNT,' ');
%END;
%END;
%LET AUC=%UPCASE(&AUC);
%LET SN=%UPCASE(&SN);
%LET SP=%UPCASE(&SP);
%LET TA=%UPCASE(&TA);
%LET YI=%UPCASE(&YI);
%LET NPV=%UPCASE(&NPV);
%LET PPV=%UPCASE(&PPV);
%LET MCC=%UPCASE(&MCC);
%LET TME=%UPCASE(&TME);
%LET TME1=%UPCASE(&TME1);
%LET TME2=%UPCASE(&TME2);
%LET WME=%UPCASE(&WME);
%LET WME1=%UPCASE(&WME1);
%LET WME2=%UPCASE(&WME2);
%LET OUTCOME_LEV = %SYSFUNC(TRANWRD(&OUTCOME_LEV,%STR(%"),));
%LET OUTCOME_LEV = %SYSFUNC(TRANWRD(&OUTCOME_LEV,%STR(%'),));
%*****************************************************************;
%* GETTING ROC OUTPUT FROM MODEL *;
%*****************************************************************;
ODS LISTING CLOSE;
PROC LOGISTIC DATA=&DATASET OUTEST=_SNSP_EST;
MODEL &OUTCOME (EVENT = "&OUTCOME_LEV") = &XVAR/EXPB OUTROC=_SNSP_ROC1;
ODS OUTPUT ASSOCIATION=_SNSP_AUC (WHERE = (UPCASE(LABEL2) = "C"));
RUN;
ODS LISTING;
DATA _SNSP_ROC1;
SET _SNSP_ROC1; INDEX =1;
RUN;
DATA _SNSP_EST (KEEP=INTERCEPT &XVAR INDEX);
SET _SNSP_EST; INDEX =1;
RUN;
DATA _SNSP_OUTPUT;
MERGE _SNSP_ROC1 _SNSP_EST;
BY INDEX;
SPEC = 1-_1MSPEC_;
YI = (_SENSIT_ + SPEC) - 1;
N = _POS_+ _NEG_+ _FALPOS_+ _FALNEG_;
TA = (_POS_ + _NEG_)/N;
X_VALUE = (LOG(_PROB_/(1-_PROB_))-INTERCEPT)/&XVAR;
IF (_NEG_+_FALNEG_) NE 0 THEN DO;
NPV = _NEG_/(_NEG_ +_FALNEG_);
END;
IF (_POS_+_FALPOS_) NE 0 THEN DO;
PPV = _POS_/(_POS_ +_FALPOS_);
END;
IF((_POS_+_FALPOS_)*(_POS_+_FALNEG_)*(_NEG_+_FALPOS_)*(_NEG_+_FALNEG_)) NE 0
THEN DO;
MCC = ((_POS_*_NEG_)-(_FALPOS_*_FALNEG_))/SQRT(((_POS_+_FALPOS_)*(_POS_+_FALNEG_)*(_NEG_+_FALPOS_)*(_NEG_+_FALNEG_)));
END;
%*****************************************************************;
%* FOR TOTAL ERRORS *;
%*****************************************************************;
%***TARGET GROUP;
LFN = 1/(_POS_ + _FALNEG_);
LFP = 1/(_NEG_ + _FALPOS_);
TLN = _FALNEG_*LFN;
TLP = _FALPOS_*LFP;
TME = TLN+TLP;
%***EQUAL FPR/FNR;
LFN1 = 1/N;
LFP1 = 1/N;
TLN1 = _FALNEG_*LFN1;
TLP1 = _FALPOS_*LFP1;
TME1 = TLN1+TLP1;
%*****************************************************************;
%* FOR WEIGHTED ERRORS *;
%*****************************************************************;
%***TARGET GROUP;
WFN = LFN/(LFN+LFP);
WFP = LFP/(LFN+LFP);
WLN = (_FALNEG_*WFN)/N;
WPN = (_FALPOS_*WFP)/N;
WME = WLN+WPN;
%***EQUAL FPR/FNR;
WFN1 = LFN1/(LFN1+LFP1);
WFP1 = LFP1/(LFN1+LFP1);
WLN1 = (_FALNEG_*WFN1)/N;
WPN1 = (_FALPOS_*WFP1)/N;
WME1 = WLN1+WPN1;
DENSITY = 0;
RUN;
%*****************************************************************;
%* RATIO FOR TOTAL-AND-WEIGHTED ERRORS *;
%*****************************************************************;


DATA _SNSP_OUTPUT;
SET _SNSP_OUTPUT;
%IF %SYSEVALF(&GAMMA > 0) %THEN %DO;
G=&GAMMA;
TME2_NSCALE = TLN1+TLP1*(1/G);
WME2 = TLN1*(G/(1+G))+TLP1*(1/(1+G));
RUN;
PROC SQL NOPRINT;
SELECT MAX(TME2_NSCALE) INTO: MAX_TME2
FROM _SNSP_OUTPUT;
QUIT;
DATA _SNSP_OUTPUT;
SET _SNSP_OUTPUT;
TME2 = TME2_NSCALE/&MAX_TME2;
RUN;
%END;
%ELSE %DO;
TME2 = .;
WME2 = .;

%LET WME2 = N;
%LET TME2 = N;
%END;
RUN;

%IF &AUC = Y %THEN %DO;
DATA _NULL_;
SET _SNSP_AUC;
CALL SYMPUT("AUCVALUE", PUT((NVALUE2),8.3));
RUN;
%END;
%LET FIRST = / &SN/ &SP/ &TA/ &YI/ &NPV/ &PPV/ &MCC/ &TME/ &TME1/ &TME2/ &WME/ &WME1/ &WME2;
%LET LIST = SENSIT SPEC TA YI NPV PPV MCC TME TME1 TME2 WME WME1 WME2;

%DO I = 1 %TO 13;
%IF %SCAN(&FIRST, &I, "/") = Y %THEN %DO;
%LET TT&I = %SCAN(&LIST, &I, " ")*X_VALUE=%EVAL(&I+1);
%END;
%ELSE %LET TT&I = ;
%END;

TITLE "Operating characteristics for &OUTCOME = &OUTCOME_LEV";
AXIS1 W=1 OFFSET=(3 PCT) LABEL=(F=&FONT H=2 "&XVAR_LABEL");
AXIS2 W=1 OFFSET=(3 PCT) LABEL=(F=&FONT H=2 A=90 R=0 "Proportion") ORDER = (0 TO 1 BY 0.1);
LEGEND1 VALUE=(F=&FONT H=1.5) LABEL=(F=&FONT H=1.5 "Operating Characteristics:");
%IF &AUC = Y %THEN %DO;
LEGEND2 VALUE=(F=&FONT H=1.5)
LABEL=(F=&FONT H=1.5 JUSTIFY=C "Operating Characteristics:"
H=1.3 JUSTIFY=C "AUC=&AUCVALUE");
%END;
SYMBOL1 H=0.3 I=NONE W=1 C=BLACK V="|";
SYMBOL2 H=1 I=JOIN L=1 W=2 C=RED;
SYMBOL3 H=1 I=JOIN L=2 W=2 C=BLUE;
SYMBOL4 H=1 I=JOIN L=4 W=2 C=GREEN;
SYMBOL5 H=1 I=JOIN L=6 W=2 C=PURPLE;
SYMBOL6 H=1 I=JOIN L=8 W=2 C=ORANGE;
SYMBOL7 H=1 I=JOIN L=10 W=2 C=BLACK;
SYMBOL8 H=1 I=JOIN L=12 W=2 C=RGR;
SYMBOL9 H=1 I=JOIN L=14 W=2 C=LIME;
SYMBOL10 H=1 I=JOIN L=16 W=2 C=BROWN;
SYMBOL11 H=1 I=JOIN L=22 W=2 C=PINK;
SYMBOL12 H=1 I=JOIN L=32 W=2 C=MAGENTA;
SYMBOL13 H=1 I=JOIN L=33 W=2 C=YELLOW;
SYMBOL14 H=1 I=JOIN L=24 W=2 C=VLIGB;
PROC GPLOT DATA = _SNSP_OUTPUT;
PLOT DENSITY*X_VALUE=1 &TT1 &TT2 &TT3 &TT4 &TT5 &TT6 &TT7 &TT8 &TT9 &TT10 &TT11 &TT12 &TT13 /OVERLAY VAXIS=AXIS2 HAXIS=AXIS1
%IF &AUC = Y %THEN %DO;
LEGEND=LEGEND2;
%END;
%ELSE %DO;
LEGEND=LEGEND1;
%END;
LABEL DENSITY = "Data Density" TA = "Total Accuracy" YI= "Youden Index"
SPEC = "Specificity" TME = "Total Error, Group" TME1 = "Total Error, Overall"
TME2 = "Scaled Total Error, Ratio" WME = "Weighted Error, Group"
WME1 = "Weighted Error, Overall" WME2 = "Weighted Error, Ratio";
RUN;
QUIT;
%EXIT:
TITLE;
%MEND SNSP_TRADEOFF;

Accepted Solutions
Solution
‎03-30-2016 07:16 AM
Super Contributor
Posts: 441

Re: A Macro For Getting More Out Of Your ROC Curve: Cut off for best sensitivity Specificity trade

Hi,

 

When I run your macro as proposed in the paper, I get

 

ERROR: Variable SENSIT not found.

And that is understandable, as the column is called _SENSIT_, with the surrounding underscores. The underscores are present in the paper, so I assume a typo of sorts is at play. For me this simple fix helped:

 

 

%* %LET LIST = SENSIT SPEC TA YI NPV PPV MCC TME TME1 TME2 WME WME1 WME2;
%LET LIST = _SENSIT_ SPEC TA YI NPV PPV MCC TME TME1 TME2 WME WME1 WME2;
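
If you want to double-check which names PROC LOGISTIC actually writes, you can list the variables of the OUTROC= data set. This is only a minimal sketch: WORK.MYDATA, RELAPSE, MARKER and WORK.ROC_CHECK are made-up names standing in for your own data set, outcome, predictor and output data set.

```sas
/* Hypothetical input: WORK.MYDATA with a binary outcome RELAPSE
   and a continuous predictor MARKER (names are assumptions) */
proc logistic data=work.mydata;
   model relapse(event="1") = marker / outroc=work.roc_check;
run;

/* SHORT prints only the variable names of the OUTROC= data set */
proc contents data=work.roc_check short;
run;
```

The listing should show _SENSIT_ and _1MSPEC_ with their underscores, which are the names the macro uses when it builds _SNSP_OUTPUT.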

 

 

What helps in troubleshooting macros is running your code with the appropriate options. For me they were

 

options mprint symbolgen;

but there are more.
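
As a rough sketch of such a debugging run, assuming a hypothetical data set WORK.MYDATA with outcome RELAPSE and predictor MARKER (none of these names come from the paper): MLOGIC is one of the additional options and traces the %IF and %DO decisions in the log.

```sas
/* Turn on macro debugging output in the log */
options mprint symbolgen mlogic;

/* Hypothetical call; replace the data set and variable names with your own */
%SNSP_TRADEOFF(DATASET=WORK.MYDATA, OUTCOME=RELAPSE, OUTCOME_LEV=1,
               XVAR=MARKER, XVAR_LABEL=Marker);

/* Switch the options off again to keep later logs readable */
options nomprint nosymbolgen nomlogic;
```

With MPRINT on, the log shows the SAS statements the macro generates, so a name like SENSIT in the generated PLOT statement is easy to spot.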

 

 

Regards, Jan.

All Replies
Regular Contributor
Posts: 164

Re: A Macro For Getting More Out Of Your ROC Curve: Cut off for best sensitivity Specificity trade

Posted in reply to jklaverstijn

Thanks Jan, I appreciate your help.

Kind regards

Am
