elkinsbe
Obsidian | Level 7

Could someone point me to documentation for these two data analysis options in the Exploration section?

Decision Tree - is this a chi-square-based tree, as I would assume? How are the branch points defined? Is there a way to adjust them?

Cluster - is this k-means clustering? What distance metric does it use? Are there any adjustable settings?

 

I would like to use one of these, depending on what I can learn here, but I don't really want to use a "black box".

 

Thank you to all in advance for your support and assistance -- Ben

1 ACCEPTED SOLUTION

PetriRoine
Pyrite | Level 9

Hello @elkinsbe 

 

Here's a little information I was able to find for you.

 

Decision Tree

The implementation mostly follows the standard C4.5 algorithm for building and pruning the decision tree. The primary difference between our implementation and C4.5 is how the desired number of branches is determined, along with the optimal variable, for each split.
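
For reference, standard C4.5 picks the splitting variable by gain ratio (information gain divided by split information). The sketch below shows that textbook criterion for a categorical input only; it is not the product's code, and the function names `entropy` and `gain_ratio` are made up for illustration. How the product chooses the number of branches is not covered here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(values, labels):
    """Textbook C4.5 gain ratio for splitting `labels` on a categorical `values`."""
    total = len(labels)
    # Group the target labels by the value of the candidate input.
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Information gain: parent entropy minus weighted entropy of the branches.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    # Split information penalizes inputs that create many small branches.
    split_info = entropy(values)
    return gain / split_info if split_info > 0 else 0.0
```

The candidate input with the highest gain ratio would be chosen for the split; an input with a single value gets a ratio of 0.0 because it cannot split anything.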

 

Cluster

The default technique is k-means clustering. If you are using the GUI, you basically define the inputs and the number of clusters.
The default method for initializing the K cluster centroids is Forgy; you can change that to simple Random. Feature scaling is done by default.
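
To make those terms concrete, here is a minimal sketch of generic k-means with Forgy initialization (k observations picked at random as the starting centroids) on standardized inputs. It assumes z-score scaling and squared Euclidean distance, neither of which is confirmed for the product, and the names `standardize` and `kmeans_forgy` are hypothetical.

```python
import random


def standardize(rows):
    """Z-score each column so no single input dominates the distance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # A constant column has zero spread; divide by 1.0 instead of 0.
    stds = [(sum((x - m) ** 2 for x in c) / len(c)) ** 0.5 or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def kmeans_forgy(rows, k, iterations=100, seed=0):
    """Plain k-means; returns (labels, centroids)."""
    rng = random.Random(seed)
    # Forgy initialization: use k randomly chosen observations as centroids.
    centroids = [list(r) for r in rng.sample(rows, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(row, centroids[j])))
            for row in rows
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == j]
            if members:
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids
```

Calling `kmeans_forgy(standardize(data), k=2)` on a list of numeric rows returns one cluster label per row. Without the scaling step, an input measured in large units would dominate the distance calculation.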

 

Are you considering using DTree to provide clusters by giving it a target that has nothing to do with the final clusters? 

 

Best regards

Petri
