Joel_Wesley
Fluorite | Level 6

Hello,

I'm using SAS Enterprise Miner 12.1.

I'm finding it useful to get the counts of events/non-events (binary target modeling) in the Posterior Probability ranges (0.95-1.00, 0.90-0.95, etc.). These are provided on the Output page in the Results, which you reach by right-clicking on a model node that has already been run. Although these counts are provided in a table for the TRAIN and VALIDATE data sets at the bottom of the Output page, they're not given for the TEST data set. I'm wondering why that's the case, and whether the counts can be obtained without having to score the TEST set and bucket the groups myself?

Thanks,

Joel

1 REPLY
DougWielenga
SAS Employee

The role of TRAINING, VALIDATE, and TEST data sets can vary across different software applications. In SAS Enterprise Miner, the TRAINING data set is typically used to build candidate models, the VALIDATE data set is typically used to choose the best model among those fit to the TRAINING data set, and the TEST data set is intended to provide a final unbiased assessment of how the model performs on holdout data that was not used in building or choosing the model. For this reason, certain assessment information was suppressed on the TEST data set to discourage its use as a secondary validation data set.

 

The Score Rankings Overlay chart and the ROC chart are created for the TEST data as well (when present). You can view the underlying table by clicking on the desired chart in the Model Comparison node results and then clicking on View --> Table to see the statistics that were used to create the charts. You would need to analyze the TEST data set separately to obtain some of the other metrics, which are only available for TRAINING and VALIDATE.
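
If you do end up analyzing the scored TEST data separately, the bucketing itself is small. Here is a minimal Python sketch of that step, not what Enterprise Miner does internally. It assumes you have already scored the TEST set and exported it as (posterior probability, target) pairs with the target coded 1 for event and 0 for non-event. The function name `bucket_counts` and the `n_bins` parameter are hypothetical, and the boundary handling (lower bound inclusive, 1.00 placed in the top range) is my assumption rather than the rule used on the Output page.

```python
def bucket_counts(scored_rows, n_bins=20):
    """Count events and non-events per posterior-probability range.

    scored_rows: iterable of (posterior_probability, target) pairs,
                 where target is 1 for an event and 0 for a non-event
                 (an assumed coding; adjust to your own target values).
    n_bins:      number of equal-width ranges; 20 gives 0.05-wide ranges.

    Returns a list of (low, high, events, non_events) tuples ordered
    from the highest range (0.95-1.00) down to the lowest (0.00-0.05).
    """
    events = [0] * n_bins
    non_events = [0] * n_bins

    for prob, target in scored_rows:
        if not 0.0 <= prob <= 1.0:
            raise ValueError(f"posterior probability out of range: {prob}")
        # Assumed convention: lower bound inclusive, and a probability of
        # exactly 1.0 is placed in the top range. Values sitting exactly
        # on a boundary are subject to floating-point rounding.
        idx = min(int(prob * n_bins), n_bins - 1)
        if target == 1:
            events[idx] += 1
        else:
            non_events[idx] += 1

    # Report the ranges from highest to lowest.
    return [
        (i / n_bins, (i + 1) / n_bins, events[i], non_events[i])
        for i in range(n_bins - 1, -1, -1)
    ]


if __name__ == "__main__":
    # Made-up rows purely to show the call; not real model output.
    sample = [(0.97, 1), (0.96, 0), (0.92, 1), (0.12, 0), (1.0, 1)]
    for low, high, n_event, n_non_event in bucket_counts(sample):
        if n_event or n_non_event:
            print(f"{low:.2f}-{high:.2f}  events={n_event}  non-events={n_non_event}")
```

Keep in mind the point above about the TEST data: a table like this is best treated as a final check rather than as input to further model selection.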

 

I hope this helps!

Doug
