lionking19063
Fluorite | Level 6

Hi,

I am doing a cluster analysis with 10 continuous variables and 3 categorical variables. Instead of converting the categorical variables into dummies, I am thinking of creating distance matrices using "PROC DISTANCE".

1) Calculate 3 distance matrices, each containing the distances between one categorical variable (id category_var1) and the 10 continuous variables (var interval(continuous_var1-continuous_var10)).

2) Then merge the 3 distance matrices back with the values of the 10 continuous variables.

3) Standardize them and use the standardized variables as the new variables in "PROC CLUSTER" or "PROC FASTCLUS".

 

Question: does this logic make sense to you, particularly step 1? Thank you.

PGStats
Opal | Level 21

Instead, you could get clusters from continuous_var1-continuous_var10 and test for a relationship between those clusters and your categories with proc freq.
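
A minimal sketch of that suggestion, not a definitive implementation. The input data set WORK.HAVE, the names category_var2 and category_var3, the standardization step, and the choice of 4 clusters are assumptions made for illustration only.

```sas
/* Assumed input: WORK.HAVE with continuous_var1-continuous_var10 */
/* and category_var1-category_var3 (names are hypothetical).      */

/* Put the continuous variables on a common scale first. */
proc stdize data=have out=have_std method=std;
   var continuous_var1-continuous_var10;
run;

/* Cluster on the continuous variables only.                 */
/* MAXCLUSTERS=4 is an arbitrary choice for this sketch.     */
/* OUT= adds the variable CLUSTER to each observation.       */
proc fastclus data=have_std maxclusters=4 out=clustered noprint;
   var continuous_var1-continuous_var10;
run;

/* Test for an association between cluster membership and each category. */
proc freq data=clustered;
   tables cluster*(category_var1 category_var2 category_var3) / chisq;
run;
```

A significant chi-square test in the PROC FREQ output would suggest that the clusters found from the continuous variables are related to that category.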

PG
lionking19063
Fluorite | Level 6

You are right. However, I really want to test the effects of categorical variables along with other continuous variables at the same time. Thank you for your response.
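
For reference, one way to use the categorical and continuous variables at the same time with PROC DISTANCE is a single Gower dissimilarity between observations, rather than three separate matrices. This is a sketch of a different technique from the steps in the original post and from the reply above. The data set WORK.HAVE, the unique row identifier obs_id, the names category_var2 and category_var3, and the average-linkage method are all assumptions; check the PROC DISTANCE documentation for how interval variables are standardized under this method.

```sas
/* Assumed input: WORK.HAVE with a unique identifier obs_id (hypothetical). */

/* METHOD=DGOWER gives a dissimilarity (1 minus Gower's similarity) */
/* and accepts interval and nominal variables in the same VAR list. */
proc distance data=have method=dgower out=dist;
   var interval(continuous_var1-continuous_var10)
       nominal(category_var1 category_var2 category_var3);
   id obs_id;
run;

/* PROC CLUSTER can read the distance data set produced above. */
/* Average linkage is an arbitrary choice for this sketch.     */
proc cluster data=dist method=average outtree=tree noprint;
   id obs_id;
run;
```

The distance matrix holds one row per observation, so this approach is only practical for moderately sized data; PROC FASTCLUS works on coordinate data and would not use this matrix directly.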
