elkinsbe
Obsidian | Level 7

May I request someone point me to documentation of these two data analysis options under the Exploration section:

Decision Tree - Is this a chi-square ruled tree, as I would assume? How are the branch points defined? Is there a way to adjust them?

Cluster - Is this k-means clustering? What metric does it use? Are there any adjustments?

 

I wish to use one of these -- depending on what I can learn here -- don't really want to use a "black box" ….

 

Thank you to all in advance for your support and assistance -- Ben

1 ACCEPTED SOLUTION

PetriRoine
Pyrite | Level 9

Hello @elkinsbe 

 

Here's a little information I was able to find for you.

 

Decision Tree

The implementation mostly follows the standard C4.5 algorithm for building and pruning a decision tree. The primary difference between our implementation and C4.5 is in how it determines the number of branches and the optimal variable for each split.
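
For anyone who wants to see what a C4.5-style split criterion looks like, here is a minimal sketch of the gain ratio that textbook C4.5 uses to rank candidate variables. It is only an illustration of the general algorithm, not the product's actual code; the function names `entropy` and `gain_ratio` are made up for this example, and it only handles categorical inputs.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gain_ratio(values, labels):
    """Textbook C4.5 gain ratio for splitting `labels` on a categorical variable.

    Hypothetical helper for illustration only; not the vendor's implementation.
    """
    n = len(labels)
    # Group the target labels by the value of the candidate variable.
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Weighted entropy that remains after the split.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalizes variables with many small branches.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0


# A variable that separates the classes perfectly vs. one that tells us nothing.
perfect = gain_ratio(["a", "a", "b", "b"], [0, 0, 1, 1])
useless = gain_ratio(["a", "b", "a", "b"], [0, 0, 1, 1])
```

In this sketch the variable with the highest gain ratio would be chosen for the split. Note that this is an entropy-based criterion, not a chi-square test.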

 

Cluster

The default technique is k-means clustering. If you are using the GUI, you basically define the inputs and the number of clusters.
The default method for initializing the K cluster centroids is Forgy; you can change that to simple Random. Feature scaling is done by default.
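
To make the terms above concrete, here is a rough sketch of generic k-means with Forgy initialization (K observations picked at random as the starting centroids) and feature scaling applied first. This is not the product's code. The function names, the z-score scaling, the squared Euclidean distance, and the sample data are all assumptions for illustration; the thread does not say which scaling or distance the product actually uses.

```python
import random
from statistics import mean, pstdev


def standardize(rows):
    """Z-score each column (assumed scaling; the product's method may differ)."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(x - m) / s for x, m, s in zip(r, mus, sds)] for r in rows]


def sq_dist(a, b):
    """Squared Euclidean distance (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_forgy(rows, k, iters=100, seed=0):
    """Generic k-means with Forgy initialization; illustration only."""
    rng = random.Random(seed)
    # Forgy: pick k distinct observations as the initial centroids.
    centroids = [list(r) for r in rng.sample(rows, k)]
    assign = None
    for _ in range(iters):
        # Assignment step: each row goes to its nearest centroid.
        new_assign = [
            min(range(k), key=lambda j: sq_dist(r, centroids[j])) for r in rows
        ]
        if new_assign == assign:
            break  # converged: no row changed cluster
        assign = new_assign
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [r for r, a in zip(rows, assign) if a == j]
            if members:
                centroids[j] = [mean(c) for c in zip(*members)]
    return assign, centroids


# Made-up data with two inputs on very different scales.
data = [[1, 100], [2, 110], [1.5, 105], [10, 900], [11, 910], [10.5, 905]]
scaled = standardize(data)
labels, centers = kmeans_forgy(scaled, k=2)
```

Without the scaling step, the second input would dominate the distance simply because its numbers are larger, which is the usual reason scaling is switched on by default.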

 

Are you considering using DTree to provide clusters by giving it a target that has nothing to do with the final clusters? 

 

Best regards

Petri
