
Produce Cluster stats on Manually updated Clusters

Contributor
Posts: 26


Hello to All,

 

I am trying to obtain the centre points for my newly generated clusters, as I will need them for future reporting and cluster scoring. After the original PROC FASTCLUS run I intervened manually and split some clusters to create additional ones. The 7 new clusters are identified by the variable NEWCLUSTER.

 

I know that if I were using the original clusters I could use the OUTSTAT= option. However, since I have split the clusters and created new ones, I now need the centre points and statistics for these as well. How can I get the statistics for the final cluster set, please?

 

OUTSTAT= example:

proc fastclus data=FINAL_CLUSters maxclusters=7 maxiter=100 converge=0
              mean=mean out=prelimvol OUTSTAT=OUTSTAT;
    var VOLUME: LOGVAL_Mean: LOGVAL_Sum:;
run;

 

I have attached an illustration of the new clusters for your reference.

Contributor
Posts: 26

Re: Produce Cluster stats on Manually updated Clusters

Nobody? I'm all alone...

Is there anything else I can provide to make this clearer?

SAS Employee
Posts: 2

Re: Produce Cluster stats on Manually updated Clusters

A simple solution is to run PROC MEANS on the data set that holds your newly assigned clusters. Use the NWAY option and a CLASS statement with NEWCLUSTER as the classification variable.

 

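/* DATA= should be the observation-level data set that contains NEWCLUSTER. */
/* FINAL_CLUSTERS is an assumed name; the OUTSTAT= table from PROC FASTCLUS */
/* holds summary statistics and has no NEWCLUSTER variable.                 */
proc means data=FINAL_CLUSTERS nway;
    class NEWCLUSTER;
    var VOLUME: LOGVAL_Mean: LOGVAL_Sum:;
    output out=CENTROIDS mean=;   /* one row of means per NEWCLUSTER value */
run;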
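
For the scoring step mentioned in the original question, one possible follow-on is to pass the CENTROIDS data set to PROC FASTCLUS as seeds and assign each new observation to its nearest centre. This is a minimal sketch, not a tested solution: NEW_DATA and SCORED are hypothetical names that do not appear in the thread, and NEW_DATA is assumed to contain the same analysis variables.

```sas
/* Sketch only: NEW_DATA and SCORED are hypothetical data set names.    */
/* CENTROIDS is the PROC MEANS output above, one row per NEWCLUSTER.    */
proc fastclus data=NEW_DATA seed=CENTROIDS
              maxclusters=7 maxiter=0   /* no iterations: seeds stay fixed */
              out=SCORED;               /* adds CLUSTER and DISTANCE       */
    var VOLUME: LOGVAL_Mean: LOGVAL_Sum:;
run;
```

The CLUSTER numbers in SCORED may follow the row order of CENTROIDS rather than the NEWCLUSTER labels, so it is worth checking the mapping before reporting. Also, because the clusters were split by hand, nearest-centre assignment will not necessarily reproduce the manual NEWCLUSTER membership.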

 

SAS Employee
Posts: 51

Re: Produce Cluster stats on Manually updated Clusters

Posted in reply to archerbum

Hello,

 

Three solutions come to mind:

 

1. PROC MEANS, just as @archerbum describes.

2. Enterprise Miner: give your NEWCLUSTER variable the Segment role and use the Segment Profile node.

3. PROC FASTCLUS with a BY statement, requesting a single cluster per NEWCLUSTER group:

/* BY-group processing needs the data sorted by NEWCLUSTER */
proc sort data=mylib.myCLUSds;
    by NEWCLUSTER;
run;

/* One cluster per BY group, no iterations */
proc fastclus data=mylib.myCLUSds
              maxclusters=1 maxiter=0
              outstat=work.abc outseed=work.def;   /* OUTSEED= or MEAN= */
    id name;
    by NEWCLUSTER;
    var _NUMERIC_;
    *var VOLUME: LOGVAL_Mean: LOGVAL_Sum: ;
run;

 

Cheers,

Koen

 

 
