SAS EM 12.3 Cluster Node - Cluster Distance Plot

Occasional Contributor
Posts: 10

I am working on Chapter 5 of Randy Collica's book, Customer Segmentation and Clustering Using SAS Enterprise Miner. The attached zip file contains: 1) the Customers dataset (100,000 customer records), 2) a Word document with images of two distance plots, and 3) the XML process flow.

 

1) Please start with the Word document. One distance plot has no circles drawn around the centroids, whereas the other has the circles drawn automatically. I need to find out why no circles are drawn automatically in the first plot.

 

2) I am enclosing the XML diagram, which contains two Cluster nodes. One is named Copied Cluster Node (taken from the sample code on the author's website); the other I created myself. My node is the one with the problem, and that is where I need help.

 

3) Cluster node property (Score): Hide Original Variables. How do I change the Yes to No? In the Transform node, four variables were transformed, and I want to see the original variables.

 

 

I would be very grateful for the help. Thanks.

 

PS: The customers dataset could not be uploaded because of the file size limit. Please let me know if it is needed; perhaps the file size problem can be worked around via Google Docs.

Occasional Contributor
Posts: 10

Re: SAS EM 12.3 Cluster Node - Cluster Distance Plot

Update on my previous post of June 19, 2016:

A truncated version of the customer file is now uploaded. The original customer dataset had 100,000 customer records; the dataset customers2 now has 70,000 records.

I trust this helps.

Thanks and best regards

SAS Employee
Posts: 106

Re: SAS EM 12.3 Cluster Node - Cluster Distance Plot

Regarding the cluster plots, the first solution is one-dimensional; that is, all of the variation between the clusters can be explained by a single latent variable. Circles, which are two-dimensional, therefore aren't needed to describe the within-cluster variation, as there is little or no variance in dimension 2. (Caveat: I haven't looked at the code. This is just my best guess as to what's going on here.)
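One rough way to check the "one-dimensional solution" guess outside Enterprise Miner is to export the cluster centroids and measure how much of their spread lies along a single axis. The sketch below is Python, not SAS, and is only an illustration: it assumes the distance plot is a two-dimensional projection of the centroids, the function name `first_axis_share` is made up for this example, and the centroid values are invented rather than taken from the customers data.

```python
import random


def first_axis_share(centroids, iterations=500, seed=0):
    """Fraction of the spread between centroids captured by one axis.

    A value close to 1.0 means the centroids lie (almost) on a line,
    so a second plot dimension would carry little or no variance.
    """
    n, dim = len(centroids), len(centroids[0])
    # Center the centroids on their overall mean.
    means = [sum(row[j] for row in centroids) / n for j in range(dim)]
    centered = [[row[j] - means[j] for j in range(dim)] for row in centroids]
    total = sum(x * x for row in centered for x in row)
    if total == 0.0:
        return 1.0  # all centroids coincide

    # Power iteration on X^T X to find the leading principal direction.
    rng = random.Random(seed)
    v = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    for _ in range(iterations):
        proj = [sum(r[j] * v[j] for j in range(dim)) for r in centered]
        w = [sum(proj[i] * centered[i][j] for i in range(n)) for j in range(dim)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            break
        v = [x / norm for x in w]

    # Spread along the leading direction, relative to the total spread.
    proj = [sum(r[j] * v[j] for j in range(dim)) for r in centered]
    return sum(p * p for p in proj) / total


# Invented centroids: four on a straight line, four on a square.
collinear = [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0), (2.0, 4.0, 6.0), (3.0, 6.0, 9.0)]
square = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

print(f"collinear centroids: {first_axis_share(collinear):.3f}")
print(f"square centroids:    {first_axis_share(square):.3f}")
```

For the collinear set the share comes out at essentially 1.0, which is the situation described above where circles add nothing; for the square it is about 0.5, meaning a second dimension is needed. Comparing this figure for the two Cluster nodes' centroids might show whether they really differ in dimensionality.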

Occasional Contributor
Posts: 10

Re: SAS EM 12.3 Cluster Node - Cluster Distance Plot

Thanks, Ray. Let me look at the system again tomorrow morning and come back to you. You have given me a line to investigate further.
In Randy Collica's solution (for the same dataset and per his XML diagram), the Cluster node generates a solution where two variables together differentiate the clusters. In the XML attachment I copied and pasted the Cluster node from the textbook solution, and I get a cluster plot with circles. The Data Source, Transform, and Filter nodes are common to both.
I am also struggling to set Hide Original Variables to No; the default value is Yes.
Thanks once again for the support.
SAS Employee
Posts: 106

Re: SAS EM 12.3 Cluster Node - Cluster Distance Plot

Hi.

 

Regarding the original variables, you should be able to see them (along with your cluster scores) if you select your Cluster node and then click the Exported Data button in the node properties. You don't need to do anything special.

 

The Hide Original Variables option is enabled only when scoring imputation is in effect (i.e., Scoring Imputation Method = Seed of Nearest Cluster). In that case, the original untransformed variables are hidden by default. If you have chosen to impute, then setting the option to No includes both sets of variables (original and transformed) when you look at the exported data.

 

Does that make sense? 

 

Ray
