elkinsbe
Obsidian | Level 7

Could someone point me to documentation for these two data analysis options under the Exploration section?

Decision Tree - Is this a chi-square-based tree, as I would assume? How are the branch points defined? Is there a way to adjust them?

Cluster - Is this k-means clustering? What metric does it use? Are there any adjustments?

 

I would like to use one of these, depending on what I can learn here, but I don't really want to use a "black box".

 

Thank you to all in advance for your support and assistance -- Ben

Accepted Solutions
PetriRoine
Pyrite | Level 9

Hello @elkinsbe 

 

Here's a little information I was able to find for you.

 

Decision Tree

The implementation largely follows the standard C4.5 algorithm for building and pruning the decision tree. The primary difference between our implementation and C4.5 is how the desired number of branches and the optimal variable are determined for each split.
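To make the "branch point" question a bit more concrete: standard C4.5 ranks candidate splits by gain ratio (information gain divided by the split's own entropy) rather than by a chi-square test. Below is a minimal Python sketch of that textbook criterion for a categorical input. It is only an illustration of plain C4.5, not the product's code, and the function names `entropy` and `gain_ratio` are made up for this example.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(items)
    return -sum((c / n) * math.log2(c / n) for c in Counter(items).values())


def gain_ratio(values, labels):
    """Textbook C4.5 gain ratio for splitting `labels` on a categorical input.

    values: one category per observation (the candidate splitting variable)
    labels: the target class of each observation
    """
    n = len(labels)

    # Group the target labels by the category of the candidate variable.
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)

    # Information gain: entropy before the split minus the weighted
    # entropy of the branches after the split.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder

    # Split information penalizes variables with many small branches.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0


if __name__ == "__main__":
    colour = ["red", "red", "blue", "blue"]
    bought = ["yes", "yes", "no", "no"]
    print(gain_ratio(colour, bought))
```

In plain C4.5 the variable with the highest gain ratio is chosen at each node. As noted above, the product differs in how it picks the number of branches and the variable, so treat this only as a reference point for the general idea.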

 

Cluster

The default technique is k-means clustering. If you are using the GUI, you basically define the inputs and the number of clusters.
The default method for initializing the K cluster centroids is Forgy; you can change that to simple Random. Feature scaling is applied by default.
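For reference, here is a rough Python sketch of what k-means with Forgy initialization and feature scaling looks like in general. Forgy initialization simply picks K of the observations at random as the starting centroids. The z-score scaling and the squared Euclidean distance used here are assumptions made for the illustration (I have not confirmed which scaling or metric the product uses), and the function names are hypothetical.

```python
import random
import statistics


def standardize(rows):
    """Z-score each column (assumed scaling; constant columns are left at 0)."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    sds = [statistics.pstdev(c) or 1.0 for c in cols]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_forgy(rows, k, iters=100, seed=0):
    """Plain k-means; returns (cluster index per row, centroids)."""
    rng = random.Random(seed)

    # Forgy initialization: use k randomly chosen observations as centroids.
    centroids = [list(r) for r in rng.sample(rows, k)]

    assign = None
    for _ in range(iters):
        # Assignment step: each row goes to its nearest centroid.
        new_assign = [
            min(range(k), key=lambda j: sq_dist(r, centroids[j])) for r in rows
        ]
        if new_assign == assign:  # no change, so the algorithm has converged
            break
        assign = new_assign

        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [r for r, a in zip(rows, assign) if a == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [statistics.fmean(c) for c in zip(*members)]
    return assign, centroids


if __name__ == "__main__":
    data = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
    labels, centers = kmeans_forgy(standardize(data), k=2)
    print(labels)
```

Scaling matters here because the distance is computed across all inputs at once; without it, an input measured in large units would dominate the cluster assignments.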

 

Are you considering using DTree to provide clusters by giving it a target that has nothing to do with the final clusters? 

 

Best regards

Petri
