biaf08
Calcite | Level 5

Hi,

 

I am fairly new to SAS and could use some help. I have been working on customer segmentation for a data set of 30,000 records with mixed variables (continuous, binary, and categorical).

 

I initially planned to turn some of the categorical variables with few levels into binary variables and then run PROC FASTCLUS on them. However, it seems that FASTCLUS only performs k-means, which is not appropriate for binary variables. I then used PROC DISTANCE to create a Gower's distance matrix directly from the mixed-variable data set to feed into PROC CLUSTER. But now I am getting a warning and an error:

 

WARNING: Unable to allocate sufficient memory. Amount requested was 0, amount available was 1691620352...

ERROR: Invalid position -2147479016 for utility file WORK.'SASTMP-000000029'n.UTILITY
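For scale, the size of a full pairwise distance matrix for 30,000 records can be estimated with a few lines of arithmetic. The sketch below is in Python rather than SAS, purely as an illustration. It assumes 8-byte numeric values (the SAS default length for numerics); the function name `pairwise_matrix_bytes` is hypothetical, and the way PROC DISTANCE and PROC CLUSTER actually lay out their working storage may differ.

```python
def pairwise_matrix_bytes(n_rows, bytes_per_value=8, lower_triangle_only=True):
    """Rough storage estimate for an n-by-n distance matrix (an assumption, not SAS internals)."""
    if lower_triangle_only:
        n_values = n_rows * (n_rows - 1) // 2   # distinct pairs only
    else:
        n_values = n_rows * n_rows              # full square matrix
    return n_values * bytes_per_value

n = 30_000
available = 1_691_620_352                       # "amount available" from the WARNING above

triangle_bytes = pairwise_matrix_bytes(n)
full_bytes = pairwise_matrix_bytes(n, lower_triangle_only=False)

print(f"lower triangle: {triangle_bytes / 1e9:.2f} GB")
print(f"full matrix:    {full_bytes / 1e9:.2f} GB")
print(f"available:      {available / 1e9:.2f} GB")
print("triangle exceeds signed 32-bit range:", triangle_bytes > 2**31 - 1)
```

Under these assumptions, even the lower triangle alone is larger than the memory reported as available, and its byte count is beyond the signed 32-bit range. That would be consistent with the negative file position in the ERROR line, though it does not prove that an overflow is the cause.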

 

Am I doing something wrong, or is my data set too large to be processed with PROC CLUSTER? Are there any alternative ways to cluster a data set with mixed variables?
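For reference, the idea behind a Gower-style dissimilarity for mixed variables can be sketched as follows. This is a minimal illustration in Python, not the PROC DISTANCE implementation: it assumes equal weights, no missing values, and treats binary variables like any other categorical variable. The function name, the `kinds` labels, and the sample records are all hypothetical.

```python
def gower_dissimilarity(a, b, kinds, ranges):
    """Average per-variable dissimilarity between two records.

    kinds[i]  is "num" for a continuous variable, "cat" for binary/categorical.
    ranges[i] is the observed range (max - min) for "num" variables, else None.
    """
    total = 0.0
    for x, y, kind, rng in zip(a, b, kinds, ranges):
        if kind == "num":
            # Range-scaled absolute difference, so each variable contributes 0..1
            total += abs(x - y) / rng if rng else 0.0
        else:
            # Simple matching: 0 if the levels agree, 1 otherwise
            total += 0.0 if x == y else 1.0
    return total / len(kinds)

# Hypothetical records: (age, sex, has_loyalty_card)
kinds = ("num", "cat", "cat")
ranges = (40.0, None, None)          # assumed age range in the data
customer_a = (30.0, "F", 1)
customer_b = (50.0, "M", 1)

d = gower_dissimilarity(customer_a, customer_b, kinds, ranges)
print(d)
```

Because every pair of records needs its own value, the number of dissimilarities grows quadratically with the number of records, which is what makes this approach expensive for large data sets.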

 

Thanks 🙂

1 REPLY 1
Ksharp
Super User

You could use PROC CLUSTER directly if you already have a design matrix, like:

 

sex
F
F
M

-->

F M
1 0
1 0
0 1
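The mapping shown above (one 0/1 column per level of the categorical variable) can be sketched in a few lines. This is a Python illustration of the idea only, not SAS code; the function name `design_matrix` is hypothetical, and levels are simply ordered alphabetically.

```python
def design_matrix(values):
    """Return the sorted levels and one 0/1 indicator row per input value."""
    levels = sorted(set(values))
    rows = [[1 if value == level else 0 for level in levels] for value in values]
    return levels, rows

levels, rows = design_matrix(["F", "F", "M"])

print(" ".join(levels))
for row in rows:
    print(" ".join(str(cell) for cell in row))
```

Each row has exactly one 1, in the column that matches the original level.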
