Maynn__
Calcite | Level 5

Hello everyone,

 

I am learning about Data Mining as part of my university course and I need to look into Clustering and Decision Trees.

 

I've noticed that you can obtain a decision tree from the Cluster node results (Cluster Profile > Tree), and I was wondering what the advantages of using this are compared with a regular Decision Tree node. I noticed a few disadvantages, but apart from it being a lot simpler, I can't see any advantages.

 

Any insight into this area would be much appreciated. I have looked through the guides and there is plenty of information on the Decision Tree node but not so much on the specifics of Cluster Profile's Tree. Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
CraigDeVault
SAS Employee

The Cluster Profile Tree that you see in the output looks for the variables that are significant and discriminating between all of the clusters that are generated.  It does use the same underlying procedure that the Decision Tree node uses, but the user has no control over the options specified in the tree.  The Cluster Profile Tree is simply another way of looking at the clusters that the node creates. 

 

The Decision Tree node allows the user to modify many properties that shape the tree. You can modify the statistic used as the splitting criterion, the size of the tree, the method for handling missing values, the minimum size of a leaf, the use of assessment measures, etc. One of the most popular features of the node is that you can interactively go into the tree and manually create the splits that are best suited to your problem, whether the splits are based on business rules, historical processes, or a professor who wants to see a certain variable as the initial split.
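
As a rough illustration of two of those properties (the splitting criterion and the minimum leaf size), here is a minimal Python sketch. It is not how the Decision Tree node is implemented; the function names, the `min_leaf` parameter, and the toy data are all hypothetical and only show how such settings can change which split gets chosen.

```python
import math
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_split(values, labels, criterion=gini, min_leaf=1):
    """Find the threshold on one numeric variable with the lowest weighted impurity.

    Returns (threshold, weighted_impurity), or None if no split leaves
    at least `min_leaf` observations on each side.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    min_leaf = max(1, min_leaf)
    best = None
    for i in range(min_leaf, n - min_leaf + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        score = (len(left) * criterion(left) + len(right) * criterion(right)) / n
        if best is None or score < best[1]:
            best = ((pairs[i - 1][0] + pairs[i][0]) / 2, score)
    return best

# Toy data (made up): one "a" observation followed by seven "b" observations.
values = [1, 2, 3, 4, 5, 6, 7, 8]
labels = ["a", "b", "b", "b", "b", "b", "b", "b"]

print(best_split(values, labels))                      # leaves of any size allowed
print(best_split(values, labels, min_leaf=3))          # each leaf needs >= 3 rows
print(best_split(values, labels, criterion=entropy))   # different impurity measure
```

With a minimum leaf size of 1 the search can isolate the single "a" observation; raising it to 3 forces a different, less pure split. That is the kind of control the Cluster Profile Tree does not expose.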

 

If you follow the Cluster node with a Decision Tree node, you can replicate the Cluster Profile Tree by setting up the same properties in the Decision Tree node. However, the Cluster Profile Tree is a quick snapshot of the clusters in a tree format, while the Decision Tree node provides the user with a plethora of properties to maximize the value of modeling a tree.
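
To make the "cluster first, then fit a tree on the cluster assignment" idea concrete, here is a small Python sketch under stated assumptions. It uses a plain k-means loop and a one-level tree (a single split chosen by Gini impurity) as stand-ins; it does not reproduce the Cluster node or the procedure behind the Cluster Profile Tree. The variable names (`income`, `age`), the data, and the starting centroids are made up.

```python
from collections import Counter

def kmeans(points, start_centroids, iterations=10):
    """Plain Lloyd-style k-means; returns one cluster index per point."""
    centroids = list(start_centroids)
    assign = []
    for _ in range(iterations):
        # Assign every point to its nearest centroid (squared Euclidean distance).
        assign = [
            min(range(len(centroids)),
                key=lambda k: sum((p - c) ** 2 for p, c in zip(pt, centroids[k])))
            for pt in points
        ]
        # Move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [pt for pt, a in zip(points, assign) if a == k]
            if members:
                centroids[k] = tuple(sum(col) / len(members) for col in zip(*members))
    return assign

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def profile_split(points, clusters, names):
    """Return (variable, threshold, impurity) for the single split that best
    separates the clusters, using the cluster index as the target."""
    best = None
    for j, name in enumerate(names):
        pairs = sorted((pt[j], c) for pt, c in zip(points, clusters))
        for i in range(1, len(pairs)):
            if pairs[i - 1][0] == pairs[i][0]:
                continue  # cannot split between identical values
            left = [c for _, c in pairs[:i]]
            right = [c for _, c in pairs[i:]]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
            if best is None or score < best[2]:
                best = (name, (pairs[i - 1][0] + pairs[i][0]) / 2, score)
    return best

# Made-up observations: (income, age). Income separates the groups; age does not.
names = ("income", "age")
points = [(20, 30), (22, 55), (25, 41), (80, 33), (85, 52), (90, 44)]

clusters = kmeans(points, start_centroids=[points[0], points[3]])
print(clusters)
print(profile_split(points, clusters, names))
```

The split the tree picks tells you which input best discriminates between the clusters, which is the "profile" reading of the tree. A full Decision Tree node would go further, with deeper trees and the configurable properties described above.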


2 REPLIES
art297
Opal | Level 21

Here is one link that describes the cluster approach: http://www2.sas.com/proceedings/forum2008/154-2008.pdf

 

It is one way to reduce the number of variables on which the tree will be based.

 

Art, CEO, AnalystFinder.com

 
