Hi
In Enterprise Miner, I ran the Cluster node on a large dataset and got 6 cluster segments. I now intend to cluster each of these segments further.
My question is: Is there a way to continue the flow from the first cluster node and run clustering on each of the resulting segments without having to export data and run them as separate data sources?
Can I use the Start Groups and End Groups nodes for this?
Thanks for your help
Roby
Roby,
I would avoid Group Processing here, since it does not let you look as deeply into each by-group as you might want. Given that you only have 6 groups to dig into, consider creating a separate path forward for each one through a Filter node that selects the desired subset for further analysis. In many cases, clustering all 6 groups again is not necessary, since there are often one or more large clusters alongside a number of relatively small ones. It is usually more helpful to break up the larger clusters.
One way to isolate a segment for further clustering is to use a Filter node as follows:
1) Make sure that all prior nodes ran successfully.
2) Add a Filter node (from the Sample Tab) for each desired subgroup.
3) For each Filter node, proceed as follows:
(a) Click on the desired Filter node to make it active
(b) Click on the ... to the right of Class Variables in the Filter node properties.
4) Select the Generate Summary button in the Interactive Class Filter dialog.
5) Select the _SEGMENT_ variable to see the histogram of segment levels.
6) Click the histogram bars for the category that you want to exclude.
7) Click Apply Filter.
8) Click OK.
9) Run the Filter node and view the results to confirm you have kept the desired observations.
10) Add a subsequent Cluster node to cluster the subset of observations.
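For readers who think more easily in code, the steps above amount to "subset on _SEGMENT_, then cluster the subset again". The Python below is only a rough sketch of that idea, not anything Enterprise Miner runs: the helper names split_by_segment and kmeans_1d, the single input variable x, and the toy data are all made up for illustration, and the tiny 1-D k-means is just a stand-in for the second Cluster node.

```python
from collections import defaultdict


def split_by_segment(rows, segment_key="_SEGMENT_"):
    """Group rows by their first-stage segment label (one Filter path per label)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[segment_key]].append(row)
    return dict(groups)


def kmeans_1d(values, k, iterations=20):
    """Tiny 1-D k-means; a hypothetical stand-in for the second Cluster node."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centers across the sorted values.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        buckets = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            buckets[nearest].append(v)
        # Keep the old center if a bucket ends up empty.
        centers = [sum(b) / len(b) if b else centers[i]
                   for i, b in enumerate(buckets)]
    # Return the sub-cluster index for each input value, in input order.
    return [min(range(k), key=lambda i: abs(v - centers[i])) for v in values]


# Made-up data: segment 1 is the "large" segment worth breaking up further.
rows = [
    {"_SEGMENT_": 1, "x": 1.0}, {"_SEGMENT_": 1, "x": 1.2},
    {"_SEGMENT_": 1, "x": 9.0}, {"_SEGMENT_": 1, "x": 9.3},
    {"_SEGMENT_": 2, "x": 50.0}, {"_SEGMENT_": 2, "x": 51.0},
]

groups = split_by_segment(rows)
large_segment = groups[1]
sub_labels = kmeans_1d([r["x"] for r in large_segment], k=2)
```

Here only segment 1 is re-clustered, which matches the suggestion to break up the larger clusters rather than all 6.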
Note: By default, the Filter node exports the levels that have not been selected. You can also have the Filter node export the selected histogram bars instead, which is helpful when you only want a few levels out of many. In that case, select the histogram bars that you want to keep in step 6 above, and make sure you change the Export Table property from Filtered to Excluded in the Filter node properties before running the Filter node.
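If the Filtered versus Excluded distinction is easy to mix up, it may help to see it as plain logic. The sketch below only mirrors the behavior described in the note; filter_node_export is a hypothetical helper, not an Enterprise Miner API, and the rows are made-up data.

```python
def filter_node_export(rows, clicked_levels, export_table="Filtered",
                       key="_SEGMENT_"):
    """Mimic which rows the Filter node passes on (hypothetical helper).

    clicked_levels: the histogram bars clicked in step 6.
    export_table:   "Filtered" exports the rows whose level was NOT clicked;
                    "Excluded" exports the rows whose level WAS clicked.
    """
    clicked = set(clicked_levels)
    if export_table == "Filtered":
        return [r for r in rows if r[key] not in clicked]
    if export_table == "Excluded":
        return [r for r in rows if r[key] in clicked]
    raise ValueError("export_table must be 'Filtered' or 'Excluded'")


# Made-up data: one row per segment, 6 segments as in the question.
rows = [{"_SEGMENT_": s} for s in (1, 2, 3, 4, 5, 6)]

# Default: click the five segments to drop, keep segment 3.
kept_default = filter_node_export(rows, [1, 2, 4, 5, 6])

# Alternative: click only segment 3 and export the clicked level instead.
kept_excluded = filter_node_export(rows, [3], export_table="Excluded")
```

Both calls end up with only segment 3, but the second needs a single click, which is why the Excluded setting is handy when you want a few levels out of many.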
Hope this helps!
Doug
Surely makes sense Doug.
Thanks
Roby