Joel_Wesley
Fluorite | Level 6

Hello,

I'm using SAS Enterprise Miner 12.1.

I'm finding it useful to get the counts of events/non-events (binary target modeling) in the Posterior Probability ranges (0.95-1.00, 0.90-0.95, etc.). These are provided on the Output page of the Results, which you reach by right-clicking a model node that has already been run. Although these counts are given in a table for the TRAIN and VALIDATE data sets at the bottom of the Output page, they're not given for the TEST data set. I'm wondering why that's the case, and whether they can be obtained without having to score the TEST set and bucket the groups myself.

Thanks,

Joel

1 REPLY
DougWielenga
SAS Employee

The role of TRAINING, VALIDATE, and TEST data sets can vary across different software applications.  In SAS Enterprise Miner, the TRAINING data set is typically used to build candidate models, the VALIDATE data set is typically used to choose the best model among those fit to the TRAINING data set, and the TEST data set is intended to provide a final unbiased assessment of how the model performs on holdout data which was not used in building or choosing the model.  For this reason, certain assessment information was suppressed on the TEST data set to discourage having it be used as a secondary validation data set.  

 

The Score Rankings Overlay chart and the ROC chart are created for the TEST data as well (when present). You can view the underlying table by clicking on the desired chart in the Model Comparison node results and then clicking View --> Table to see the statistics that were used to create the chart. You would need to analyze the TEST data set separately to obtain some of the other metrics, which are only available for TRAINING and VALIDATE.
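If you do end up bucketing the scored TEST data yourself, the counting step is small. Below is a minimal sketch in Python, not Enterprise Miner code. It assumes you have already scored the TEST set and exported it as pairs of (posterior probability of the event, actual target coded 1/0). The function name `bucket_counts`, the 20 equal-width ranges, and the choice to treat each range's lower bound as inclusive are all assumptions made for illustration, so the boundaries may not match exactly how the Output page assigns observations that fall on a cut point.

```python
from collections import defaultdict


def bucket_counts(scored_rows, n_buckets=20):
    """Count events and non-events per posterior-probability range.

    scored_rows: iterable of (posterior_probability, actual_target) pairs,
    where actual_target is 1 for an event and 0 for a non-event.
    n_buckets=20 gives ranges of width 0.05 (0.95-1.00, 0.90-0.95, ...).
    """
    counts = defaultdict(lambda: [0, 0])  # range index -> [events, non_events]
    for prob, target in scored_rows:
        # The small epsilon guards against floating-point error at cut points;
        # min() places a probability of exactly 1.0 in the top range.
        idx = min(int(prob * n_buckets + 1e-9), n_buckets - 1)
        counts[idx][0 if target == 1 else 1] += 1

    # Report the highest probability range first; empty ranges are omitted.
    table = []
    for idx in sorted(counts, reverse=True):
        low = idx / n_buckets
        high = (idx + 1) / n_buckets
        events, non_events = counts[idx]
        table.append((f"{low:.2f}-{high:.2f}", events, non_events))
    return table


if __name__ == "__main__":
    # Made-up scored rows, for illustration only.
    sample = [(0.97, 1), (0.96, 0), (1.0, 1), (0.93, 1), (0.95, 1), (0.12, 0)]
    for label, events, non_events in bucket_counts(sample):
        print(label, events, non_events)
```

Keeping the range index as an integer until the labels are formatted avoids most of the floating-point trouble you would get from comparing probabilities against 0.05 increments directly.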

 

I hope this helps!

Doug
