deega
Quartz | Level 8

Hi,

 

I am trying to do segmentation using PROC FASTCLUS, but most of the data points fall into just one cluster. My input is a utility matrix that I created from customers' purchase history and demographic data. Could anybody please tell me what's wrong?

My input looks as follows:

S.NoPromotionIDFRTotal_PreturnA1A2A3A4A5A6A7A8A9A10A11A12A13A14A15A16A17A18A19A20Other_AM1M2M3M4Other_MO1O2O3O4O5O6O7O8O9O10RegionAgeGender
104-0.23831810010000000000000000000100000000100000-0.954720.3832481
104-0.23831820000200000000000000000200000002000000-1.113970.9952251
104-0.238317100000000001000000000000100000000000010.080419-0.228732
104-0.23831820000000200000000000000200000000200000-0.71584-0.228731
1042.1363831810100000000000000000000100000100000000-0.63622-0.228732
104-0.238315100000000000000000000001000000001000000.6378010.3832482
104-0.238318100000000000100000000001000000001000000.080419-0.228731
104-0.238318300000000300000000000000300000003000000.2396710.9952252
104-0.238318200000000020000000000000200000002000000.5581750.3832481
104-0.238328300020000000000100000002100000002000011.274809-0.228732
104-0.23831812000000010000000000000100000000000001-0.63622-0.840712
104-0.238313100000000001000000000000100000001000001.513687-0.840711
104-0.23831810000100000000000000000100000000100000-0.63622-0.840711
104-0.238313100000100000000000000000010000000000100.3192970.3832482
10 REPLIES
Ksharp
Super User

It looks like your data are categorical, not continuous.

First use PROC DISTANCE to compute a distance matrix for the categorical variables, then run PROC CLUSTER on that matrix.

Search for the following keywords at support.sas.com:

category data cluster analysis
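
The PROC DISTANCE then PROC CLUSTER route could be sketched roughly as follows. This is only an illustration under assumptions: `std_sample` is a hypothetical, modest-size subset of the input (the distance matrix is n-by-n, so it grows quickly with the number of rows), and treating Region and Gender as nominal and Age and A1-A20 as interval is a guess based on the column names shown in the question.

```sas
/* Assumption: std_sample is a hypothetical subset of the input data.  */
/* Assumption: Region and Gender are nominal; Age and A1-A20 interval. */
proc distance data=std_sample method=dgower out=dist;
   var nominal(Region Gender)
       interval(Age A1-A20);
run;

/* Hierarchical clustering on the dissimilarity matrix from above. */
proc cluster data=dist method=average outtree=tree noprint;
run;

/* Cut the tree into 10 clusters and save the assignments. */
proc tree data=tree nclusters=10 out=clus noprint;
run;
```

METHOD=DGOWER (one minus Gower's similarity) is used here because it accepts nominal and interval variables together; other measures may suit the data better.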

deega
Quartz | Level 8
@Ksharp
This might sound odd, but is there a way to convert categorical data to continuous data?
Ksharp
Super User

No, there isn't.

Ksharp
Super User

Or you could take a look at this:

 

Overview: PRINQUAL Procedure
The PRINQUAL procedure performs principal component analysis (PCA) of qualitative, quantitative, or
mixed data. PROC PRINQUAL is based on the work of Kruskal and Shepard (1974); Young, Takane, and

 

 

Plot the first two principal components on the X and Y axes and see which points belong to which cluster.
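
A minimal sketch of that idea, assuming the same hypothetical `std_sample` subset and the same guessed variable roles (Region and Gender categorical, Age and A1-A20 numeric); the choice of OPSCORE and IDENTITY transformations is illustrative, not a recommendation:

```sas
/* Assumption: std_sample is a hypothetical subset of the input data. */
/* Optimal scoring for the categorical variables, identity transform  */
/* for the numeric ones; keep two components and write their scores.  */
proc prinqual data=std_sample out=pq n=2 replace standard scores;
   transform opscore(Region Gender)
             identity(Age A1-A20);
run;

/* Scatter plot of the first two principal component scores. */
proc sgplot data=pq;
   where _TYPE_ = 'SCORE';
   scatter x=Prin1 y=Prin2;
run;
```

Groups of points that separate visually in this plot are candidates for clusters.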
 

deega
Quartz | Level 8
Before FASTCLUS I tried PROC DISTANCE and PROC CLUSTER, but I have a large data set (10,000,000 records) and got an error saying they cannot be used on data that large. Is there any other way?
Rick_SAS
SAS Super FREQ

What code are you using for PROC FASTCLUS?

deega
Quartz | Level 8

Here is my code

 

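/* k-means clustering of data set std into at most 10 clusters; assignments are written to clus */
proc fastclus data=std out=clus maxclusters=10;
   var x--y;   /* every variable positioned from x through y in the data set */
run;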

 

My variables are a mix of categorical and continuous.
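
To see how uneven the assignments are, the CLUSTER variable that PROC FASTCLUS writes to the OUT= data set can be tabulated. A small sketch, assuming the `clus` data set produced by the code above:

```sas
/* Count how many observations landed in each cluster. */
proc freq data=clus;
   tables cluster / nocum;
run;
```

If one cluster holds nearly all of the rows, the table makes that visible at a glance.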

Ksharp
Super User

Cluster analysis can't be applied to mixed data (categorical and continuous).

Babloo
Rhodochrosite | Level 12

Can we do cluster analysis only with continuous variables?

deega
Quartz | Level 8
OK. If I convert my data to categorical, since most of it is categorical, how can I cluster it, given that it's a large data set?
