May I request someone point me to documentation of these two data analysis options under the Exploration section:
Decision Tree - is this a chi-square-based tree, as I would assume? How are the branch points defined? Is there a way to adjust them?
Cluster - is this k-means clustering? What metric does it use? Are there any adjustments?
I wish to use one of these, depending on what I can learn here, but I don't really want to use a "black box".
Thank you to all in advance for your support and assistance -- Ben
Hello @elkinsbe
Here's a little information I was able to find for you.
Decision Tree
The implementation mostly follows the standard C4.5 algorithm for building and pruning the decision tree. The primary difference between our implementation and C4.5 is in how the desired number of branches, along with the optimal variable, is determined at each split.
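For reference, standard C4.5 picks splits using gain ratio (information gain divided by the split information), not a chi-square test. Below is a minimal sketch of that textbook criterion for a categorical input. It is not the product's code, and the function names `entropy` and `gain_ratio` are made up for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def gain_ratio(values, labels):
    """Textbook C4.5 gain ratio for splitting `labels` on a categorical `values` column."""
    total = len(labels)
    # Group the target labels by the value of the candidate split variable.
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Weighted entropy remaining after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalizes variables with many small branches.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0
```

A variable that separates the target perfectly into two equal branches scores 1.0, and one that tells you nothing about the target scores 0.0. C4.5 evaluates the candidate variables this way and splits on the best one.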
Cluster
The default technique is k-means clustering. If you are using the GUI, you basically define the inputs and the number of clusters.
The default method for initializing the K cluster centroids is Forgy; you can change that to simple Random. Feature scaling is done by default.
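To make the clustering part less of a black box, here is a rough sketch of k-means with Forgy initialization, which picks K of the observations at random as the starting centroids. This is generic textbook k-means, not the product's implementation. The squared Euclidean distance and the z-score scaling are assumptions on my part, since the reply above does not say which metric or scaling is used, and all function names are hypothetical.

```python
import random
import statistics

def standardize(rows):
    """Z-score each column (assumed scaling; the actual method may differ)."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    # Fall back to 1.0 for constant columns to avoid dividing by zero.
    stds = [statistics.pstdev(c) or 1.0 for c in cols]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

def sq_dist(a, b):
    """Squared Euclidean distance (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_forgy(rows, k, max_iter=100, seed=0):
    """Lloyd's k-means with Forgy initialization."""
    rng = random.Random(seed)
    # Forgy: choose k distinct observations as the initial centroids.
    centroids = [list(r) for r in rng.sample(rows, k)]
    assignment = None
    for _ in range(max_iter):
        # Assign every observation to its nearest centroid.
        new_assignment = [min(range(k), key=lambda j: sq_dist(row, centroids[j]))
                          for row in rows]
        if new_assignment == assignment:
            break  # converged: no observation changed cluster
        assignment = new_assignment
        # Move each centroid to the mean of its members.
        for j in range(k):
            members = [row for row, a in zip(rows, assignment) if a == j]
            if members:
                centroids[j] = [statistics.fmean(col) for col in zip(*members)]
    return assignment, centroids
```

Typical use would be `assignment, centroids = kmeans_forgy(standardize(data), k=3)`. Because the starting centroids are random, different seeds can give different clusters on data that is not well separated.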
Are you considering using DTree to provide clusters by giving it a target that has nothing to do with the final clusters?
Best regards
Petri