Is the following sequence for connecting the Training/Validation/Test data sets to the Interactive Grouping node the right way to build a scorecard?
SAS Code - Splits the data set into Training and Validation
Risk_features_test - is the Test data set
I am asking because the test data set is also connected to Interactive Grouping, and I wonder whether this will cause falsely high performance on the test data set.
Thanks
Assuming the role of your Test data set is set to Test, then yes, this should be correct.
The IGN node doesn't group based on the test data, so no worries about overfitting to that.
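To illustrate the principle outside Enterprise Miner, here is a minimal Python sketch. It is not how the IGN node is implemented; the function names (`fit_bins`, `fit_woe`, `apply_woe`), the equal-frequency binning, the smoothing constant, and the toy data are all assumptions made for illustration. The point is only that cut points and weights of evidence are learned from the training data, and the test data is merely scored with them.

```python
import math
from bisect import bisect_right


def fit_bins(values, n_bins=3):
    """Learn equal-frequency cut points from the training values only."""
    ordered = sorted(values)
    return [ordered[len(ordered) * k // n_bins] for k in range(1, n_bins)]


def assign_bin(value, cuts):
    """Return the index of the bin that a value falls into."""
    return bisect_right(cuts, value)


def fit_woe(values, targets, cuts, smoothing=0.5):
    """Compute smoothed WOE per bin from training data.

    Convention assumed here: WOE = ln(share of non-events / share of events).
    """
    n_bins = len(cuts) + 1
    events = [0.0] * n_bins
    non_events = [0.0] * n_bins
    for value, target in zip(values, targets):
        b = assign_bin(value, cuts)
        if target == 1:
            events[b] += 1
        else:
            non_events[b] += 1
    total_events = sum(events) + smoothing * n_bins
    total_non_events = sum(non_events) + smoothing * n_bins
    return [
        math.log(
            ((non_events[b] + smoothing) / total_non_events)
            / ((events[b] + smoothing) / total_events)
        )
        for b in range(n_bins)
    ]


def apply_woe(values, cuts, woe):
    """Score any data set with bins and WOE that were fitted elsewhere."""
    return [woe[assign_bin(v, cuts)] for v in values]


# Hypothetical toy data: one characteristic and a binary target (1 = event).
train_x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
train_y = [1, 1, 1, 0, 1, 0, 0, 0, 0]
test_x = [2, 5, 10]

cuts = fit_bins(train_x)             # learned from training data only
woe = fit_woe(train_x, train_y, cuts)
test_woe = apply_woe(test_x, cuts, woe)  # test data is only transformed
```

Because the test targets never enter `fit_bins` or `fit_woe`, the grouping cannot be tuned to the test set, which is why test performance is not inflated by passing that data through the grouping step.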
Thank you so much Wendy, it is really a motivating help in my learning.
Kind regards,
Mari
Hello,
I am wondering whether the validation data is used for interactive grouping. If it is not, how can I force the validation set to be used as well?
Thanks,