Maynn__
Calcite | Level 5

Hello everyone,

 

I am learning about Data Mining as part of my university course and I need to look into Clustering and Decision Trees.

 

I've noticed that you can obtain a decision tree from the Cluster node results (Cluster Profile > Tree), and I was wondering what the advantages are of using this over a regular Decision Tree node. I noticed a few disadvantages, but apart from it being a lot simpler, I can't see any advantages.

 

Any insight into this area would be much appreciated. I have looked through the guides and there is plenty of information on the Decision Tree node but not so much on the specifics of Cluster Profile's Tree. Thanks!

1 ACCEPTED SOLUTION

CraigDeVault
SAS Employee

The Cluster Profile Tree that you see in the output looks for the variables that are significant and discriminating between all of the clusters that are generated.  It does use the same underlying procedure that the Decision Tree node uses, but the user has no control over the options specified in the tree.  The Cluster Profile Tree is simply another way of looking at the clusters that the node creates. 
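To make the idea of a "discriminating" variable concrete, here is a minimal sketch in Python (not SAS code, and not the procedure the Cluster node actually runs). It treats the cluster ID as the target and searches for the single split that best separates the clusters. The Gini criterion, the toy data, and the names `gini`, `best_split`, `income`, and `age` are all assumptions made for illustration.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, names):
    """Return (variable, threshold, impurity reduction) of the best binary split."""
    n = len(rows)
    parent = gini(labels)
    best = (None, None, 0.0)
    for j, name in enumerate(names):
        distinct = sorted({r[j] for r in rows})
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(distinct, distinct[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            child = (len(left) * gini(left) + len(right) * gini(right)) / n
            if parent - child > best[2]:
                best = (name, t, parent - child)
    return best

# Hypothetical toy data: (income, age) per observation. The cluster IDs are
# assumed to come from an earlier clustering step.
names = ["income", "age"]
rows = [(20, 30), (22, 55), (25, 41), (60, 33), (65, 52), (70, 44)]
clusters = [1, 1, 1, 2, 2, 2]

variable, threshold, gain = best_split(rows, clusters, names)
print(variable, threshold, round(gain, 3))
```

In this toy data the clusters differ in income but overlap in age, so the search picks income as the splitting variable. A profile tree repeats this kind of search within each branch.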

 

The Decision Tree node allows the user to modify many properties that are important in creating the shape of the tree.  You can modify the statistic that is used as the criterion for a splitting variable, the size of the tree, the methodology for dealing with missing values, the minimum size of a leaf, the use of assessment measures, etc.  One of the most popular features of the node is that you can interactively go into the tree and manually create the splits that are best suited to your problem, whether the splits are based on business rules, historical processes, or a professor's request to see a certain variable as the initial split.
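As a rough illustration of why such properties matter, the Python sketch below (again, not SAS code) exposes two of them as function arguments: the splitting criterion and the minimum leaf size. The function `choose_split`, its parameter names, and the data are hypothetical and do not correspond to actual Decision Tree node property names.

```python
from collections import Counter
from math import log2

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def choose_split(values, labels, criterion=gini, min_leaf=1):
    """Best threshold on one variable, subject to a minimum leaf size."""
    n = len(values)
    parent = criterion(labels)
    best_t, best_gain = None, 0.0
    distinct = sorted(set(values))
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        left = [y for v, y in zip(values, labels) if v <= t]
        right = [y for v, y in zip(values, labels) if v > t]
        if len(left) < min_leaf or len(right) < min_leaf:
            continue  # this split would create a leaf that is too small
        child = (len(left) * criterion(left) + len(right) * criterion(right)) / n
        if parent - child > best_gain:
            best_t, best_gain = t, parent - child
    return best_t, best_gain

# Hypothetical data: one observation of class "A" and five of class "B".
values = [1, 2, 3, 4, 5, 6]
labels = ["A", "B", "B", "B", "B", "B"]

t_default, _ = choose_split(values, labels)              # no leaf-size limit
t_min2, _ = choose_split(values, labels, min_leaf=2)     # leaves need >= 2 rows
t_entropy, _ = choose_split(values, labels, criterion=entropy)
print(t_default, t_min2, t_entropy)
```

With no leaf-size limit the split isolates the single "A" observation at 1.5. Requiring at least two observations per leaf moves the split to 2.5, so the same data gives a differently shaped tree.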

 

If you follow the Cluster node with a Decision Tree node, you can replicate the Cluster Profile Tree by setting up the same properties in the Decision Tree node.  However, the Cluster Profile Tree is a quick snapshot of the clusters in a tree format, while the Decision Tree node provides the user with a plethora of properties to maximize the value of modeling a tree.



art297
Opal | Level 21

Here is one link that describes the cluster approach: http://www2.sas.com/proceedings/forum2008/154-2008.pdf

 

It is one way to reduce the number of variables on which the tree will be based.

 

Art, CEO, AnalystFinder.com

 
