Slash
Quartz | Level 8

Hi, guys

 

About this statement: "Subtree selection using validation data can be eliminated by choosing Method=Largest or Method=N in the Method property." I can't understand why it is false.

 

[Attached screenshot: 捕获.JPG]

Accepted Solutions
DougWielenga
SAS Employee

About this statement: "Subtree selection using validation data can be eliminated by choosing Method=Largest or Method=N in the Method property." I can't understand why it is false.

 

 

I am not sure where you found that statement or what you mean by "it is false". I searched for the word 'eliminated' in the SAS Enterprise Miner help and did not find any references for the Decision Tree node.

 

For Method=Largest, the selection is based only on the training data, per the Decision Tree node help:

If the Method property under Subtree grouping is set to Largest, then the Decision Tree node uses the largest subtree after it prunes the nodes that do not increase the assessment based on the training data.

 

For Method=N, the help indicates that it selects the smallest subtree with the best assessment value, so the result depends on which assessment value is chosen. In this situation, you will not necessarily get the same tree for the same training data if the validation data sets differ (or if no validation data is present).
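
To make the dependence on the validation data concrete, here is a minimal sketch in Python. It is not SAS Enterprise Miner code and does not claim to reproduce the node's internals: the `Subtree` type, both selector functions, and all of the scores are hypothetical and invented for illustration. It only applies the two rules as worded above ("largest subtree" and "smallest subtree with the best assessment value") to the same candidate subtrees scored on two different validation sets.

```python
from typing import NamedTuple, Sequence


class Subtree(NamedTuple):
    leaves: int        # size of the candidate subtree
    assessment: float  # hypothetical score, higher is better


def select_largest(subtrees: Sequence[Subtree]) -> Subtree:
    """Pick the subtree with the most leaves, ignoring the scores."""
    return max(subtrees, key=lambda s: s.leaves)


def select_smallest_best(subtrees: Sequence[Subtree]) -> Subtree:
    """Pick the smallest subtree among those with the best score."""
    best = max(s.assessment for s in subtrees)
    tied = [s for s in subtrees if s.assessment == best]
    return min(tied, key=lambda s: s.leaves)


# The same candidate subtrees (same training data), scored on two
# made-up validation sets A and B.
leaf_counts = [1, 2, 3, 5, 8]
scores_valid_a = [0.60, 0.71, 0.78, 0.78, 0.75]
scores_valid_b = [0.60, 0.74, 0.73, 0.72, 0.70]

candidates_a = [Subtree(n, s) for n, s in zip(leaf_counts, scores_valid_a)]
candidates_b = [Subtree(n, s) for n, s in zip(leaf_counts, scores_valid_b)]

chosen_a = select_smallest_best(candidates_a)
chosen_b = select_smallest_best(candidates_b)
largest_a = select_largest(candidates_a)
largest_b = select_largest(candidates_b)

print("assessment-based choice, validation A:", chosen_a.leaves, "leaves")
print("assessment-based choice, validation B:", chosen_b.leaves, "leaves")
print("largest subtree in both cases:", largest_a.leaves, "leaves")
```

In this toy setup the score-based rule picks a different subtree for each validation set, while the size-based rule returns the same subtree both times because it never looks at the scores.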

 

If I have misunderstood your question, please let me know. Also, it would be helpful to know where the statement appears, as it might be in error.

 

Hope this helps!

Doug

 

 

 

 


Slash
Quartz | Level 8

Thanks, I got it. This statement is from a multiple-choice poll about the Decision Tree node in SAS EM. I'm learning it.
