Hi. Sorry if this seems like a stupid question, but I wanted to get some clarification on the Role values available for the File Import node. I understand "Train" and "Validate", but I would like to know the difference between the "Test" and "Score" roles.
This relates to a decision tree that I built. I have a File Import node set to the Train role and connected to a Decision Tree node, and that Decision Tree node is connected to a Score node. I also have another data set on a second File Import node that will be connected to the same Score node, but I don't know which role to give it: is it Test, or is it Score? I asked my professor about it, but the only answer I got was that he is more used to using the Score role than the Test role. I don't want to accept that answer without an exact explanation.
If your other data set has the target in it, and you are using it as a hold-out data set for assessing your model, then the role should be Test. If your data does not contain the target and you want to score it to get predictions, then the role should be Score.
Typically, holdout and test mean the same thing. Because the validation partition can be used in training your model (for pruning, early stopping, etc.), you keep the test or holdout partition for final assessment, based on data that was not used in any part of training your model.
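To make the Test versus Score distinction concrete, here is a minimal sketch in plain Python. It is not SAS Enterprise Miner code: the `predict` function is a hypothetical stand-in for a trained decision tree, and the column names (`income`, `buys`) and data values are made up for illustration. The point is only that data with a known target can be assessed (Test), while data without a target can only receive predictions (Score).

```python
# Hypothetical illustration only; not SAS Enterprise Miner code.

def predict(row: dict) -> str:
    """Stand-in for a trained decision tree: a single hard-coded split."""
    return "yes" if row["income"] >= 50 else "no"


def assess(rows: list, target: str) -> float:
    """Test role: the target is known, so accuracy can be measured."""
    correct = sum(predict(r) == r[target] for r in rows)
    return correct / len(rows)


def score(rows: list) -> list:
    """Score role: there is no target, so only predictions are produced."""
    return [predict(r) for r in rows]


# Hold-out data that still contains the target column ("buys").
test_rows = [
    {"income": 60, "buys": "yes"},
    {"income": 40, "buys": "no"},
    {"income": 55, "buys": "no"},
    {"income": 30, "buys": "no"},
]

# New data with no target column; we only want predictions for it.
score_rows = [{"income": 70}, {"income": 20}]

test_accuracy = assess(test_rows, "buys")
predictions = score(score_rows)

print(test_accuracy)  # 0.75
print(predictions)    # ['yes', 'no']
```

Calling `assess` on `score_rows` would fail because there is no `buys` column to compare against, which is the practical reason data without a target gets the Score role.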