06-19-2016 01:01 PM
I am working on Chapter 5 of Randy Collica's book, Customer Segmentation and Clustering Using SAS Enterprise Miner. The attached zip file contains: 1) Customers dataset (100,000 customer records) 2) Word doc with images of two distance plots 3) XML process flow
1) Please start with the Word doc. One distance plot does not have any circles drawn around the centroids, whereas the other has the circles drawn automatically. I need to find out why no circles are drawn automatically in the first plot.
2) I am enclosing the XML diagram, which has two Cluster nodes. One is named Copied Cluster Node (from the sample code on the author's website); the other node was created by me. My node is the one with the problem I need help with.
3) Cluster node property (Score): Hide Original Variables. How do I change the Yes to No? Four variables were transformed in the Transform node, and I want to see the original variables.
I would be very grateful for the help. Thanks.
PS: The customers dataset could not be uploaded because of the file size limit. Please let me know if it is necessary; perhaps the file size problem can be worked around via Google Docs.
06-20-2016 07:39 AM
06-20-2016 09:34 AM
Regarding the cluster plots, the first solution is one-dimensional; that is, all of the variation between the clusters can be explained by a single latent variable. Thus circles, which are two-dimensional, aren't needed to describe the within-cluster variation, as there is little or no variance in dimension 2. (Caveat: I haven't looked at the code. This is just my best guess as to what's going on here.)
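To make the "one-dimensional solution" idea concrete, here is a minimal Python sketch. It is not what Enterprise Miner does internally; the centroid values and the helper name `centroid_dimensionality` are made up for illustration. It only shows how to check whether a set of cluster centroids lies along a single direction, in which case a second plot dimension carries essentially no variation.

```python
import numpy as np

def centroid_dimensionality(centroids, tol=1e-8):
    """Return (share of variation per dimension, effective number of dimensions).

    Hypothetical helper for illustration only; not an Enterprise Miner API.
    """
    # Center the centroids so only the variation between clusters remains.
    centered = centroids - centroids.mean(axis=0)
    # Singular values measure how much the centroids spread along each direction.
    s = np.linalg.svd(centered, compute_uv=False)
    explained = s**2 / np.sum(s**2)
    # Count the directions whose spread is not negligible relative to the largest.
    rank = int(np.sum(s > tol * s[0]))
    return explained, rank

# Made-up centroids that all lie on one line: a one-dimensional solution.
collinear = np.array([[0., 0., 0.], [1., 2., 3.], [2., 4., 6.], [4., 8., 12.]])
# Made-up centroids that spread across several directions.
spread = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.], [1., 1., 3.]])

explained_1d, rank_1d = centroid_dimensionality(collinear)
explained_nd, rank_nd = centroid_dimensionality(spread)
print("collinear centroids:", explained_1d.round(3), "dimensions:", rank_1d)
print("spread centroids:", explained_nd.round(3), "dimensions:", rank_nd)
```

If the first set of centroids behaves like `collinear`, nearly all of the variation falls in the first dimension, which would be consistent with the guess above.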
06-20-2016 12:38 PM
06-20-2016 01:19 PM - edited 06-20-2016 01:25 PM
Regarding the original variables, you should be able to see them (along with your cluster scores) if you select your Cluster node and then press the Exported Data button in the node properties. You don't need to do anything special.
The Hide Original Variables option is enabled only when scoring imputation is in effect (i.e., Scoring Imputation Method = Seed of Nearest Cluster). In that case, the original untransformed variables are hidden by default. If you have chosen to impute, then setting the option to No includes both sets of variables (original and transformed) when you look at the exported data.
Does that make sense?