ggfggrr
Obsidian | Level 7

Is the following sequence for connecting the Training/Validation/Test data sets to the Interactive Grouping node the right approach when building a scorecard?

SAS Code: splits the data set into Training and Validation

Risk_features_test: the Test data set

I am asking because the test data set is also connected to Interactive Grouping, and I wonder whether this will cause falsely high performance on the test data set.

Thanks

[Attached screenshot: Test dataset to Interactive Grouping.PNG]

Accepted Solution
WendyCzika
SAS Employee

Assuming the role of your Test data set is set to Test, then yes, this should be correct.

 

The Interactive Grouping (IGN) node doesn't group based on the test data, so no worries about overfitting to that.
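To illustrate the principle (this is not the IGN node's actual implementation), here is a minimal Python sketch, assuming fixed cut points chosen by the analyst and the common scorecard convention WOE = ln(share of goods / share of bads). The function names `fit_woe_bins` and `apply_woe`, the 0.5 smoothing, and the sample data are all hypothetical. The point is that the weight-of-evidence values are learned from the training rows only, and the test rows are merely looked up against them, so their labels never influence the grouping.

```python
import math
from bisect import bisect_right


def fit_woe_bins(values, targets, cut_points):
    """Learn one weight-of-evidence (WOE) value per bin from TRAINING rows only.

    targets: 1 = bad (event), 0 = good (non-event).
    cut_points: sorted bin boundaries (hypothetical, chosen by the analyst).
    """
    n_bins = len(cut_points) + 1
    goods = [0] * n_bins
    bads = [0] * n_bins
    for x, y in zip(values, targets):
        b = bisect_right(cut_points, x)  # index of the bin that x falls into
        if y == 1:
            bads[b] += 1
        else:
            goods[b] += 1
    total_good = sum(goods)
    total_bad = sum(bads)
    woe = []
    for g, b in zip(goods, bads):
        # 0.5 smoothing avoids log(0) for empty cells (an assumption of this sketch)
        dist_good = (g + 0.5) / (total_good + 0.5 * n_bins)
        dist_bad = (b + 0.5) / (total_bad + 0.5 * n_bins)
        woe.append(math.log(dist_good / dist_bad))
    return woe


def apply_woe(values, cut_points, woe):
    """Map each value to the WOE of its bin; no target labels are needed."""
    return [woe[bisect_right(cut_points, x)] for x in values]


# Made-up example: one numeric characteristic, e.g. age
train_x = [22, 25, 31, 38, 45, 52, 58, 63]
train_y = [1, 1, 1, 0, 0, 1, 0, 0]
test_x = [24, 40, 60]
cuts = [30, 50]

woe = fit_woe_bins(train_x, train_y, cuts)   # uses training labels only
test_woe = apply_woe(test_x, cuts, woe)      # test rows are only scored
```

Because `apply_woe` never sees the test targets, performance measured on the test rows is not inflated by the grouping step.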

 


ggfggrr
Obsidian | Level 7

Thank you so much, Wendy. Your help is really motivating for my learning.

 

Kind regards,

Mari

busrafenerci1
Calcite | Level 5

Hello,

 

I am wondering whether the validation data is used for interactive grouping. If it is not, how can I force the node to use the validation set as well?

 

Thanks,
