R_A_G_
Calcite | Level 5

Hello,

I have 3 questions:

1. How do I interpret the tree diagram?

2. How can I specify the number of clusters I want?

3. What is the code for k-means?

Thank you

7 REPLIES
Ksharp
Super User

1. How do I interpret the tree diagram?

The tree only displays the correlation (or distance) between nodes.

2. How can I specify the number of clusters I want?

That is hard. You will need to read more of the documentation.

Or use Component Analysis to help you decide how many clusters you need.

3. What is the code for k-means?

There are three distances defined in PROC CLUSTER; k-means is one of them.

If I'm not mistaken, k-means uses the mean of the members of each cluster.

Ksharp
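For the hierarchical route, a minimal sketch might look like the following. It assumes the SASHELP.IRIS sample data set and Ward's method, neither of which comes from this thread; substitute your own data set and variables. In the tree that PROC TREE draws, the height at which two branches join roughly reflects how dissimilar the merged clusters are, and NCLUSTERS= cuts the tree at the level that leaves the requested number of clusters.

```sas
/* Assumed example data: SASHELP.IRIS (not from the original thread). */
/* METHOD=WARD is one of several available methods, chosen here only  */
/* for illustration.                                                  */
proc cluster data=sashelp.iris method=ward outtree=work.tree;
   var SepalLength SepalWidth PetalLength PetalWidth;
run;

/* Draw the tree and cut it into 3 clusters; 3 is an arbitrary choice. */
proc tree data=work.tree nclusters=3 out=work.hclus;
run;
```

The OUT= data set from PROC TREE should contain a CLUSTER variable with the cluster assigned to each observation.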

R_A_G_
Calcite | Level 5

So there is no simple code to use for cluster analysis and specify the number of clusters I want?

FriedEgg
SAS Employee

PROC FASTCLUS and PROC MODECLUS have a MAXCLUSTERS option that lets you, to some extent, specify the number of clusters you want. PROC VARCLUS has MINCLUSTERS and MAXCLUSTERS options as well. It depends on what type of cluster analysis you intend to perform.
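As a rough illustration of the MAXCLUSTERS option, here is a minimal k-means-style sketch with PROC FASTCLUS. The SASHELP.IRIS data set, the variable list, and the choice of 3 clusters are assumptions made for the example, not something taken from this thread.

```sas
/* Assumed example data: SASHELP.IRIS (not from the original thread). */
/* MAXCLUSTERS=3 requests at most 3 clusters; 3 is an arbitrary pick. */
proc fastclus data=sashelp.iris maxclusters=3 out=work.kclus;
   var SepalLength SepalWidth PetalLength PetalWidth;
run;

/* Look at the first few rows of the output data set. */
proc print data=work.kclus(obs=10);
run;
```

The OUT= data set adds a CLUSTER variable (the assigned cluster) and a DISTANCE variable (the distance from the observation to its cluster seed) to the original columns.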

R_A_G_
Calcite | Level 5

I have a set of data and am trying to find some sort of order or pattern in it, and thought cluster analysis would be a good option. I did attempt an exploratory factor analysis, which did not work. Could you please give me some sample code?

Thank you

FriedEgg
SAS Employee

The documentation for every procedure comes with a number of useful examples:

Cluster Procedure:

http://support.sas.com/documentation/cdl/en/statug/65328/HTML/default/viewer.htm#statug_cluster_exam...

ballardw
Super User

FASTCLUS does allow setting the number of clusters. However, it will force the data to create exactly that many clusters, even if one cluster consists of one record.

The online help shows an example of using a variety of standardization methods followed by a call to FASTCLUS and a print to see how well the clusters matched known categories. I found that very helpful.

I've used proc print with likely combinations of categorical variables (the list option is your friend) to identify characteristics of the resulting clusters.
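A sketch of that workflow, under stated assumptions: the SASHELP.IRIS data set stands in for your data, METHOD=STD is just one of the standardization methods PROC STDIZE offers, and PROC FREQ with the LIST option is used here as one way to compare clusters against a known category (the Species variable in this example).

```sas
/* Assumed example data: SASHELP.IRIS (not from the original thread). */
/* Standardize each variable to mean 0 and standard deviation 1.      */
proc stdize data=sashelp.iris out=work.iris_std method=std;
   var SepalLength SepalWidth PetalLength PetalWidth;
run;

/* Cluster the standardized data; 3 clusters is an arbitrary choice. */
proc fastclus data=work.iris_std maxclusters=3 out=work.kclus_std noprint;
   var SepalLength SepalWidth PetalLength PetalWidth;
run;

/* Compare cluster membership with the known category. */
proc freq data=work.kclus_std;
   tables Species*cluster / list;
run;
```

Swapping METHOD=STD for another method (for example METHOD=RANGE) and rerunning lets you see how sensitive the cluster assignments are to the standardization.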

Ksharp
Super User

No. The code for PROC CLUSTER is very simple. You can take FriedEgg's suggestion: check the documentation, which already has lots of sample code you can refer to.

The number of clusters is hard to decide, but you can specify it yourself: 2 or 4 or 6 or anything else.

Component Analysis can help you understand the pattern in the data, which can help you decide which number of clusters is best.
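If "Component Analysis" here means principal component analysis (an assumption on my part), one possible sketch is to plot the first two components and look for visible groupings before settling on a number of clusters. The SASHELP.IRIS data set and variable names are again only placeholders.

```sas
/* Assumed example data: SASHELP.IRIS (not from the original thread). */
/* N=2 keeps the first two principal components; OUT= stores the      */
/* component scores as Prin1 and Prin2.                               */
proc princomp data=sashelp.iris out=work.pcs n=2;
   var SepalLength SepalWidth PetalLength PetalWidth;
run;

/* Scatter plot of the scores to eyeball how many groups there are. */
proc sgplot data=work.pcs;
   scatter x=Prin1 y=Prin2;
run;
```

Well-separated clouds of points in the plot suggest a candidate number of clusters to try in FASTCLUS or PROC TREE.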

