Is the following sequence for connecting the Training/Validation/Test data sets to Interactive Grouping the right way to build a scorecard?
SAS Code - Splits the data set into Training and Validation
Risk_features_test - is the Test data set
I am asking because the Test data set is also connected to Interactive Grouping, and I wonder whether this will produce falsely high performance on the Test data set.
Thanks
Assuming you have the role for your Test data set set to Test, then yes this should be correct.
The IGN node doesn't group based on the test data, so no worries about overfitting to that.
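To illustrate why this avoids leakage, here is a minimal Python sketch of the general principle: bin cut points and weight of evidence (WOE) values are learned from the training data only, and the test data is merely mapped onto them. This is not the IGN node's actual algorithm. The function names, the simple quantile binning, the smoothing constant, the toy data, and the WOE sign convention (log of the non-event share over the event share) are all assumptions made for the example.

```python
import math
from bisect import bisect_right


def fit_bins(values, n_bins=3):
    """Derive cut points from TRAINING values only (simple quantile cuts)."""
    ordered = sorted(values)
    cuts = [ordered[(k * len(ordered)) // n_bins] for k in range(1, n_bins)]
    return sorted(set(cuts))


def assign_bin(value, cuts):
    """Return the index of the bin that a value falls into."""
    return bisect_right(cuts, value)


def fit_woe(values, targets, cuts, smoothing=0.5):
    """Compute WOE per bin from TRAINING data only (target 1 = event/bad)."""
    n_bins = len(cuts) + 1
    good = [smoothing] * n_bins  # smoothing avoids log(0) in empty cells
    bad = [smoothing] * n_bins
    for value, target in zip(values, targets):
        b = assign_bin(value, cuts)
        if target == 1:
            bad[b] += 1
        else:
            good[b] += 1
    total_good, total_bad = sum(good), sum(bad)
    return [
        math.log((good[b] / total_good) / (bad[b] / total_bad))
        for b in range(n_bins)
    ]


def apply_woe(values, cuts, woe):
    """Map any data set (validation, test) onto the already fitted groups."""
    return [woe[assign_bin(v, cuts)] for v in values]


# Hypothetical toy data: low values of the feature are riskier.
train_x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
train_y = [1, 1, 0, 1, 0, 0, 0, 0, 0]
test_x = [0, 100, 5]

cuts = fit_bins(train_x)               # learned from training data only
woe = fit_woe(train_x, train_y, cuts)  # learned from training data only
test_woe = apply_woe(test_x, cuts, woe)  # test data is only transformed
```

Because `fit_bins` and `fit_woe` never receive `test_x`, the test set cannot influence the groups, so scoring it afterwards gives an honest estimate of performance.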
Thank you so much, Wendy. Your answer is a real help and motivation in my learning.
Kind regards,
Mari
Hello,
I am wondering whether the validation data is used for interactive grouping. If it is not, how can I force the node to use the validation set as well?
Thanks,