I've built a random forest model in SAS Enterprise Miner 14.1, and I'm trying to use the same model to score an out-of-time dataset to check whether the model is consistent.
However, when I added a Score node and then a Model Comparison node to check the KS and AUROC statistics on the out-of-time data, I got the same statistics from both Model Comparison nodes! Please take a look at my project flow below:
TL;DR: I'm getting the same results for the node Model Comparison(5) (the last one in the flow) and the node Model Comparison(3). Why is that happening?
I used the Data Partition node to split my data into three sets: 50% for training, 30% for validation, and 20% for testing. So yes, I'm also creating a Test partition with the Data Partition node.
I exported the scored dataset (using a Save Data node) to SAS Enterprise and used a macro to calculate the KS and ROC index there. Problem solved!
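For anyone who wants to sanity-check the numbers outside Enterprise Miner, here is a minimal sketch of the same two statistics computed from an exported scored table. It is written in Python rather than SAS and is not the macro I used; the function name `ks_and_auc` and the example values are made up for illustration. It assumes a binary target coded 1 for the event and one predicted probability per row.

```python
def ks_and_auc(labels, scores):
    """Return (KS, AUC) for binary labels (1 = event) and model scores.

    Hypothetical helper for illustration: walks the rows from highest to
    lowest score, tracking the cumulative event rate (TPR) and non-event
    rate (FPR). KS is the largest gap between the two curves; AUC is the
    trapezoidal area under TPR plotted against FPR.
    """
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")

    tp = fp = 0
    prev_tpr = prev_fpr = 0.0
    ks = auc = 0.0
    i = 0
    while i < len(pairs):
        # Rows with identical scores are handled as one group so ties
        # do not depend on sort order.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        tpr = tp / n_pos
        fpr = fp / n_neg
        auc += (fpr - prev_fpr) * (tpr + prev_tpr) / 2
        ks = max(ks, abs(tpr - fpr))
        prev_tpr, prev_fpr = tpr, fpr
        i = j
    return ks, auc


# Made-up example rows: target flag and predicted event probability.
target = [1, 1, 0, 1, 0, 0]
p_event = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
ks, auc = ks_and_auc(target, p_event)
print(f"KS = {ks:.3f}, AUC = {auc:.3f}")
```

On those six made-up rows this prints KS = 0.667 and AUC = 0.889. Running it once on the training export and once on the out-of-time export makes it obvious whether the two sets of statistics really differ.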
However, the easiest way to do this is the one another user mentioned earlier: use the out-of-time dataset as the Test data in your modeling flow.
Thanks for the help!
Are you creating a Test partition with the Data Partition node? If not, then you can do this: make sure the role of the Input Data node for your out-of-time data is set to Test (assuming you have the target in that data), then connect it directly to the HP Forest node. You don't need the Score node or the second Model Comparison node; when you re-run the first Model Comparison node, it should now include assessment on this test (hold-out) data.
If you are creating a Test partition with the Data Partition node, there still might be a way to do it that I'm not thinking of yet...