Clustering nominal variables in Enterprise miner 1...

04-05-2017 09:25 AM

Hi,

I have a data set with over 30 attributes: mostly nominal, plus 3 ordinal, 3 binary, and just **one interval** variable (age).

Could anyone please advise if I can successfully do a clustering on this data set using Enterprise Miner 14.2?

- Does the Cluster Node support the clustering of data with nominal variables?
- Does the Cluster Node support the clustering of data with **both numeric and nominal variables**?
- Can I use the HP Cluster node instead (in single-machine mode)? How does this differ from using the normal Cluster node (in single-machine mode)?
- If EM does not support clustering nominal variables, should they be re-coded using surrogate keys?

Any help would be really appreciated.

Kind Regards

RJ

Accepted Solutions

Solution

04-08-2017 05:09 AM

Posted in reply to Ribbonjovi

04-07-2017 01:37 PM

The EM Cluster node does support nominal variables. The variables need to be identified as inputs in the data source. Use a Segment Profile node, found under the Assess tab, to explore the relationships. For interval variables, red represents the population values and blue represents the segment. For nominal variables, the inner pie represents the population and the outer one represents the segment.

HP Cluster node currently only runs on interval variables. You would need to change your nominal variables to indicator variables (0,1) to use in HP Cluster.
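To illustrate what "indicator variables (0,1)" means here, below is a minimal sketch in plain Python, outside of Enterprise Miner. It is not how EM does the conversion; the helper name `to_indicators` and the sample columns `age` and `region` are made up for the example. Each level of a nominal column becomes its own 0/1 column.

```python
def to_indicators(rows, column):
    """Replace one nominal column with 0/1 indicator columns, one per level.

    `rows` is a list of dicts; `column` is the nominal field to expand.
    Both the function and the field names are hypothetical.
    """
    # Collect the distinct levels once so every row gets the same columns.
    levels = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Copy every field except the nominal one being expanded.
        new_row = {k: v for k, v in row.items() if k != column}
        for level in levels:
            new_row[f"{column}_{level}"] = 1 if row[column] == level else 0
        encoded.append(new_row)
    return encoded


# Made-up sample data: one interval variable and one nominal variable.
rows = [
    {"age": 34, "region": "north"},
    {"age": 51, "region": "south"},
    {"age": 27, "region": "north"},
]
encoded = to_indicators(rows, "region")
print(encoded[0])  # {'age': 34, 'region_north': 1, 'region_south': 0}
```

Note that a nominal variable with many levels produces one column per level, so high-cardinality inputs can widen the data set considerably.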

You can use either node in single-machine mode. Check the help to compare the settings and options available in each. To open it, click the blue book icon with a ? and then Node Reference. Each node's help is organized under its tab name; for example, HP Cluster is under the HPDM Nodes section.

All Replies

Posted in reply to MelodieRush

04-08-2017 05:27 AM

Thank you, Melodie. Your response was very helpful.

Maybe you can help me out with these ones too!

- Is there a node in Enterprise Miner that helps convert nominal variables (especially ones with high cardinality) to indicator variables (0,1)?
- In the segment profile below, do you know if there is an option for labeling (or customizing) all the segment inputs simultaneously? It is laborious to double-click each input and edit its graph properties individually.

Really appreciate your help.

Cheers

Rj