11-03-2014 02:00 PM
I am new to clustering. In the project I've been working on, I ran a PCA on several test variables and came up with 5 factors/components with eigenvalues greater than 1. I standardized the variables that loaded onto these components, and I am now clustering. Based on the results that PROC CLUSTER gave me (CCC, pseudo statistics, etc.), I decided that I want 5 clusters as well.

Now I am trying to pick out the "defining" characteristics of each of those 5 clusters and examine their makeup (which race, gender, and age groups make up each cluster). To do this, I am trying to add a new variable to my dataset that assigns each participant (by ID) a number from 1 to 5 indicating which cluster that participant belongs to.

However, all the guides I find on the internet only tell me how to determine the optimal number of clusters. I already know how many clusters I want; I just need to create the variable that tells me which participant goes into which cluster. What is the best way to do this?