Rovere
Calcite | Level 5

I've built a random forest model in SAS Enterprise Miner 14.1, and I'm trying to use the same model to score an out-of-time data set to check whether the model is consistent.

However, when I added a Score node followed by a Model Comparison node to check the KS and AUROC statistics on the out-of-time data, I got the same statistics from both Model Comparison nodes! Please take a look at my project flow below:

[Image: SAS.PNG, screenshot of the project flow]

TL;DR: I'm getting the same results from the node Model Comparison(5) (the last one in the flow) and the node Model Comparison(3). Why is that happening?

1 ACCEPTED SOLUTION

Rovere
Calcite | Level 5

I used the Data Partition node to split my data into three sets: 50% for training, 30% for validation, and 20% for testing. So yes, I'm also creating a Test partition with the Data Partition node.

 

I exported the scored data (using a Save Data node) to SAS Enterprise and used a macro there to calculate the KS and ROC index. Problem solved!
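The macro itself isn't shown in this thread. For readers who want to check the two statistics outside Enterprise Miner, here is a minimal sketch in Python (not the SAS macro mentioned above) of how KS and AUROC can be computed from an exported scored table. It assumes a binary target coded 1 for the event and one predicted event probability per row; the function name `ks_and_auc` and the sample values are hypothetical and only for illustration.

```python
from typing import Sequence, Tuple


def ks_and_auc(y_true: Sequence[int], scores: Sequence[float]) -> Tuple[float, float]:
    """Return (KS, AUROC) for binary labels (1 = event) and predicted scores.

    KS is the largest gap between the cumulative event rate and the
    cumulative non-event rate when rows are ranked by score.
    AUROC counts tied scores as half a concordant pair.
    """
    if len(y_true) != len(scores):
        raise ValueError("y_true and scores must have the same length")
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one event and one non-event")

    # Rank rows from highest to lowest score.
    pairs = sorted(zip(scores, y_true), key=lambda p: p[0], reverse=True)

    tp = fp = 0          # events / non-events seen so far
    ks = 0.0
    concordant = 0.0     # running count of (event, non-event) pairs ranked correctly
    i = 0
    while i < len(pairs):
        # Collect all rows that share the same score (a tie group).
        j = i
        pos_in_group = neg_in_group = 0
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                pos_in_group += 1
            else:
                neg_in_group += 1
            j += 1
        # Each non-event in the group is outranked by every earlier event
        # and ties with the events in its own group (counted as 0.5).
        concordant += neg_in_group * (tp + pos_in_group / 2.0)
        tp += pos_in_group
        fp += neg_in_group
        ks = max(ks, abs(tp / n_pos - fp / n_neg))
        i = j

    return ks, concordant / (n_pos * n_neg)


if __name__ == "__main__":
    # Made-up scored rows, purely illustrative.
    target = [1, 0, 1, 0]
    p_event = [0.9, 0.8, 0.7, 0.1]
    ks, auc = ks_and_auc(target, p_event)
    print(f"KS={ks:.2f} AUROC={auc:.2f}")  # KS=0.50 AUROC=0.75
```

Running the same function on the training data and on the out-of-time data gives two pairs of numbers to compare directly. If they match exactly, that suggests both runs are reading the same rows.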

 

However, the easiest way to do this is the one mentioned earlier by another user: use the out-of-time data as the test set in your modeling flow.

 

Thanks for the help!


2 REPLIES
WendyCzika
SAS Employee

Are you creating a Test partition with the Data Partition node? If not, you can do this: set the role of the Input Data node for your out-of-time data to Test (assuming that data contains the target), then connect it directly to the HP Forest node. You don't need the Score node or the second Model Comparison node; when you re-run the first Model Comparison node, it should now include assessment on this test (hold-out) data.

 

If you are creating a Test partition with the Data Partition node, there still might be a way to do it that I'm not thinking of yet...

 

