
Clustering by Example in SAS® Enterprise Miner™

Started ‎10-14-2015 by
Modified ‎11-30-2015 by

 

Download the Files (GitHub)

 

This tip is part of Learn by Example with SAS® Enterprise Miner™ Templates where a new data mining topic is introduced and explained with one or more example SAS Enterprise Miner process flow diagrams.

 

The topic discussed here is clustering, an unsupervised learning technique that works on unlabeled data (data with no target variable). Data is often unlabeled because labeling it can be expensive, time consuming, or both, so this technique can be widely applied.

 

So what is clustering? In its simplest form, the goal of this technique is to create clusters (also known as groups or segments) of observations so that within-cluster variability is minimized and between-cluster variability is maximized. In the end, all observations are divided into clusters so that every observation belongs to exactly one cluster.
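To make the idea of minimizing within-cluster variability concrete, here is a minimal k-means sketch in plain Python. It is an illustration only and is not taken from the example flows: the toy data, the function names, and the choice of k-means itself are assumptions, and the sketch does not claim to reproduce how the Cluster node in SAS Enterprise Miner works internally.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def assign(points, centers):
    """Give every point the index of its nearest center (exactly one cluster each)."""
    return [
        min(range(len(centers)), key=lambda k: squared_distance(p, centers[k]))
        for p in points
    ]


def k_means(points, k, iterations=20, seed=0):
    """Alternate between updating centers and reassigning points."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # hypothetical initialization: k random observations
    labels = assign(points, centers)
    for _ in range(iterations):
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = mean_point(members)
        new_labels = assign(points, centers)
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    return labels, centers


def within_cluster_ss(points, labels, centers):
    """Within-cluster variability: squared distance of each point to its own center."""
    return sum(squared_distance(p, centers[lab]) for p, lab in zip(points, labels))


# Made-up toy data with two visually obvious groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels, centers = k_means(points, k=2)
wss = within_cluster_ss(points, labels, centers)
print("labels:", labels)
print("within-cluster sum of squares:", wss)
```

Each observation receives exactly one label, and the within-cluster sum of squares is the quantity the algorithm tries to make small. For a fixed data set, a smaller within-cluster sum of squares implies a larger between-cluster sum of squares, because the two add up to the total variability around the overall mean.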

 

To get started with clustering using SAS Enterprise Miner, download the process flow diagrams (XML files) and the accompanying PDF documentation for the following two examples from the GitHub repository at https://github.com/sassoftware/dm-flow/tree/master/Clustering

 

1. ClusterNodeExplore: A simple example that shows how the Cluster and Segment Profile nodes can be used to explore data

ClusterNodeExplore.png

2. ClusterNodePredict: An advanced example that uses the Cluster node as part of a regression modeling flow to demonstrate one of the ways it can be used to improve the prediction accuracy of the model.

ClusterNodePredict.png

 

To run these examples, refer to the README file that is part of the GitHub repository at https://github.com/sassoftware/dm-flow. Please note that these examples were tested with SAS Enterprise Miner 13.2.

 

Version history
Last update:
‎11-30-2015 02:53 PM
Updated by:
