
Decision tree in SAS

Learner
Posts: 1

Hello everyone,

I am learning about data mining as part of my university course, and I need to look into clustering and decision trees.

I've noticed that you can obtain a decision tree from the Cluster node results (Cluster Profile > Tree), and I was wondering what the advantages are of using this over a regular Decision Tree node. I noticed a few disadvantages, but apart from it being a lot simpler, I can't see any advantages.

Any insight into this area would be much appreciated. I have looked through the guides, and there is plenty of information on the Decision Tree node but not much on the specifics of the Cluster Profile Tree. Thanks!


Accepted Solutions
Solution
‎03-28-2017 11:08 AM
SAS Employee
Posts: 21

Re: Decision tree in SAS

The Cluster Profile Tree that you see in the output identifies the variables that discriminate significantly among the clusters the node generates. It uses the same underlying procedure as the Decision Tree node, but the user has no control over the options specified for the tree. The Cluster Profile Tree is simply another way of looking at the clusters that the node creates.

The Decision Tree node allows the user to modify many properties that shape the tree. You can modify the statistic used as the splitting criterion, the size of the tree, the method for handling missing values, the minimum leaf size, the assessment measures, and so on. One of the most popular features of the node is that you can go into the tree interactively and manually create the splits best suited to your problem, whether those splits are based on business rules, historical processes, or a professor who wants to see a certain variable as the initial split.

If you follow the Cluster node with a Decision Tree node, you can replicate the Cluster Profile Tree by setting up the same properties in the Decision Tree node. However, the Cluster Profile Tree is a quick snapshot of the clusters in tree format, while the Decision Tree node gives the user a plethora of properties to maximize the value of modeling a tree.
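
To make the idea concrete outside Enterprise Miner, here is a minimal sketch in plain Python (it is not SAS code, and it is not what the Cluster node actually runs). It clusters a toy data set, then treats the cluster label as the target and searches for the single split that best separates the clusters. The data, the variable names x1 and x2, the starting centers, and the Gini criterion are all assumptions chosen for illustration.

```python
from collections import Counter


def kmeans(rows, start_centers, iters=20):
    """Plain k-means with fixed starting centers; returns one label per row."""
    centers = [list(c) for c in start_centers]  # copy so the input is not mutated
    labels = []
    for _ in range(iters):
        # Assign each row to its nearest center (squared Euclidean distance).
        labels = [
            min(range(len(centers)),
                key=lambda k: sum((x - c) ** 2 for x, c in zip(row, centers[k])))
            for row in rows
        ]
        # Move each center to the mean of the rows assigned to it.
        for k in range(len(centers)):
            members = [row for row, lab in zip(rows, labels) if lab == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the list is pure)."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels, names):
    """Return (weighted impurity, variable name, threshold) of the best single split."""
    best = None
    for j, name in enumerate(names):
        values = sorted({row[j] for row in rows})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [lab for row, lab in zip(rows, labels) if row[j] <= t]
            right = [lab for row, lab in zip(rows, labels) if row[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, name, t)
    return best


# Hypothetical toy data: the two groups differ on x1, while x2 is mixed.
names = ["x1", "x2"]
rows = [[1.0, 5.0], [1.2, 1.0], [0.8, 3.0], [5.0, 4.0], [5.2, 2.0], [4.8, 0.5]]

labels = kmeans(rows, start_centers=[rows[0], rows[3]])  # step 1: cluster
split = best_split(rows, labels, names)                  # step 2: profile the clusters
print("cluster labels:", labels)
print("best first split: variable", split[1], "at threshold", split[2])
```

On this toy data the split lands on x1, the variable that separates the two groups. A full tree would keep splitting each branch the same way, and the properties mentioned above (splitting criterion, minimum leaf size, and so on) are the kinds of choices that this sketch hard-codes.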


All Replies
PROC Star
Posts: 7,468

Re: Decision tree in SAS

Here is one link that describes the cluster approach: http://www2.sas.com/proceedings/forum2008/154-2008.pdf

It is one way to reduce the number of variables on which the tree will be based.

 

Art, CEO, AnalystFinder.com

 
