Hello SAS community,
I am using PROC HPSPLIT to create a binary classification tree. It mostly seems to run fine, except for some reason it is not showing me the model sensitivity and specificity in the output, even though I do get an ROC plot and confusion matrix. I have specified the EVENT= option in the MODEL statement, which should trigger the calculation of sensitivity and specificity. Any suggestions?
My (non-reproducible) code is below. The outcome, COUNT, is a numeric variable with discrete integer values 1 through 6. The format "count_f" supplied to PROC HPSPLIT collapses it into two categories, making it binary. The data set has approximately 10,000 observations, but missing data reduces the usable sample to approximately 7,000.
Session information: SAS 9.4 TS Level 1M7
PROC FORMAT;
value count_f
1 = "1 organ"
2 - high = "> 1 organ";
RUN;
ODS GRAPHICS ON;
PROC HPSPLIT data=my_data CVMODELFIT seed=102122;
CLASS COUNT X1 X2 X3 X4 X5;
MODEL COUNT(event='> 1 organ') = X1 X2 X3 X4 X5;
format COUNT count_f.;
GROW entropy;
PRUNE costcomplexity;
RUN;
The log indicates that the procedure has correctly recognized my outcome as binary, with the intended event category:
NOTE: PROC HPSPLIT is modeling the event COUNT=> 1 organ for sensitivity, specificity, AUC, and ROC curve calculations
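For reference, the two levels that the format produces can also be checked outside of PROC HPSPLIT. This is only a minimal sketch: it assumes the count_f format above has already been defined in the session, and it uses the MISSING option so that observations with a missing COUNT are listed as their own row.

```sas
/* Sketch: tabulate COUNT under the count_f format to confirm */
/* that only two formatted levels appear.                     */
PROC FREQ data=my_data;
   tables COUNT / missing;   /* MISSING lists missing values as a level */
   format COUNT count_f.;    /* same format that is given to PROC HPSPLIT */
RUN;
```

The table should list "1 organ" and "> 1 organ", plus a row for missing values if COUNT has any.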
Thanks for any insight you can share!

Hello,
That is very strange.
The way you have set it up is correct.
The code below, run on dummy data (only 19 records, 17 of which have the event), demonstrates this:
(I used the same release: SAS 9.4 TS Level 1M7.)
PROC FORMAT;
value count_f
11 = "1 organ"
12 - high = "> 1 organ";
RUN;
ODS GRAPHICS ON;
PROC HPSPLIT data=sashelp.class CVMODELFIT seed=102122;
id name;
CLASS age sex;
MODEL age(event='> 1 organ') = sex height weight;
format age count_f.;
GROW entropy;
PRUNE costcomplexity;
RUN;
/* end of program */
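In the meantime, since the confusion matrix is printed, sensitivity and specificity can be worked out by hand from its four cells. The DATA step below is only a sketch of that arithmetic: the counts tp, fn, fp and tn are made-up placeholders (not taken from any real output), to be replaced with the numbers from your own confusion matrix, where "> 1 organ" is the event.

```sas
/* Sketch: sensitivity and specificity from confusion matrix counts. */
/* All four counts are hypothetical placeholders.                    */
data sens_spec;
   tp = 40;   /* actual event,     predicted event     */
   fn = 10;   /* actual event,     predicted non-event */
   fp = 5;    /* actual non-event, predicted event     */
   tn = 45;   /* actual non-event, predicted non-event */
   sensitivity = tp / (tp + fn);   /* share of true events classified as events */
   specificity = tn / (tn + fp);   /* share of true non-events classified as non-events */
   put sensitivity= specificity=;  /* write both values to the log */
run;
```

These values correspond to whatever cutoff the confusion matrix was built with, so they should match the procedure's own numbers once that table shows up.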
I would open a ticket with SAS Technical Support in your country!
Koen