after cluster node: mark rows with cluster number,...

10-31-2014 04:42 PM

I'm a relative newbie. My questions are about the Cluster node in Enterprise Miner. I would like to be able to do the following:

1) After running the node, I would like to create a new variable that gives the cluster number for every row. With that in place, there should be statistical procedures that allow a fairly deep analysis by segment, instead of just the lattice graph output of the Segment Profile node.

2) If 1) can be done, how do I go from Enterprise Miner to the statistical procedures? An example would be, for each segment, a significance test on a table with the target variable in rows and the input variable's values in columns, like what Excel calls a pivot table. The test is a straightforward chi-square test of independence, and Excel has a chitest() function that gives the p-value, but I would rather not save the cluster results to an Excel file every time; I would rather run the test in SAS. Would this involve somehow switching from EM to SAS 9.3, which came with the package I got from the Hanlon center? If so, how do I do that? Can I just save the project and then open it in 9.3? I suppose I can try that, but I only just thought of this problem and have not yet become familiar with straightforward statistical SAS.

3) In the Segment Profile node, which is run after the Cluster node on a text field in the data, how do I show statistics for each segment, such as the proportion of a target variable value (e.g., Died = Yes)? I have been through what seems like all of the Segment Profile parameters and results without finding any way to do that.

To me, just knowing the clusters is of limited use unless segment statistics are also derived. The prototype problem is to segment the customer population and relate the segments to the purchase of products and the effectiveness of product features, advertising, promotions, etc., somewhat like conjoint analysis on the results of text mining clustering.
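
To make 2) and 3) concrete, here is a rough sketch of the kind of per-segment analysis I have in mind, written as plain SAS. It is only an illustration under assumptions: `work.clustered` is a hypothetical name for a copy of the data with the cluster number on every row, `_SEGMENT_` is assumed to be the name of the cluster-number variable (the actual name in the Cluster node's exported data should be checked), and `died` and `input_var` are placeholder names for the target and one input variable.

```sas
/* Sketch only: data set and variable names below are assumptions. */

/* BY-group processing needs the data sorted by the segment variable. */
proc sort data=work.clustered out=work.clustered_sorted;
  by _SEGMENT_;
run;

/* One target-by-input table per segment, each with a chi-square */
/* test of independence (the CHISQ option on the TABLES statement). */
proc freq data=work.clustered_sorted;
  by _SEGMENT_;
  tables died * input_var / chisq;
run;

/* Segment-by-target crosstab: with column and overall percentages */
/* suppressed, the row percentages show the share of each target   */
/* value (e.g., Died = Yes) within each segment.                   */
proc freq data=work.clustered_sorted;
  tables _SEGMENT_ * died / nocol nopercent;
run;
```

The first PROC FREQ step would replace the Excel pivot table plus chitest() step, and the second would give the per-segment proportions asked about in 3).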