frupaul
Quartz | Level 8

Hi everyone,

Is there a way to automatically collapse the levels of a categorical variable in SAS Enterprise Miner? (I don't want to use the Replacement Editor, as that is a manual approach.)

 

One way of doing this in SAS Enterprise Guide is to use Greenacre's method. At each step it collapses the pair of levels whose merger causes the smallest reduction in the chi-square statistic, so the resulting categorical variable keeps a strong relationship with the target.

 

Could Greenacre's method be performed in SAS Enterprise Miner?

Thanks

1 ACCEPTED SOLUTION

Accepted Solutions
TWoodfield
SAS Employee

The attached code contains a SAS program that can be implemented in a SAS Code node in SAS Enterprise Miner. Use the code at your own risk. It represents an attempt to implement Greenacre's method to consolidate the levels of a categorical variable. You may also wish to consider using decision trees as described in Section 9.4 of the course, "Applied Analytics Using SAS Enterprise Miner."
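For readers without access to the attachment, the idea behind Greenacre's method can be sketched in a few lines. This is not the attached SAS program. It is a minimal, language-neutral illustration in Python, assuming a table of target-class counts per level and a fixed number of clusters to stop at. The function names, the example counts, and the stopping rule are all assumptions made for illustration.

```python
from itertools import combinations


def chi_square(table):
    """Pearson chi-square statistic for a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for row, r_tot in zip(table, row_totals):
        for observed, c_tot in zip(row, col_totals):
            expected = r_tot * c_tot / grand_total
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    return stat


def greenacre_collapse(counts, n_clusters):
    """Greedily merge levels until n_clusters remain.

    counts maps each level to a list of target-class counts.
    Returns (level -> cluster index, chi-square after each step).
    """
    clusters = [([level], list(row)) for level, row in counts.items()]
    history = [chi_square([row for _, row in clusters])]
    while len(clusters) > max(n_clusters, 1):
        best = None
        # Try every pair and keep the merge that leaves chi-square highest,
        # i.e. the merge that loses the least association with the target.
        for i, j in combinations(range(len(clusters)), 2):
            merged_row = [a + b for a, b in zip(clusters[i][1], clusters[j][1])]
            rest = [row for k, (_, row) in enumerate(clusters) if k not in (i, j)]
            stat = chi_square(rest + [merged_row])
            if best is None or stat > best[0]:
                best = (stat, i, j, merged_row)
        stat, i, j, merged_row = best
        merged = (clusters[i][0] + clusters[j][0], merged_row)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
        history.append(stat)
    mapping = {lvl: idx for idx, (levels, _) in enumerate(clusters) for lvl in levels}
    return mapping, history


# Made-up counts of [non-event, event] per level, for illustration only.
example_counts = {
    "A": [90, 10],
    "B": [88, 12],
    "C": [50, 50],
    "D": [52, 48],
    "E": [20, 80],
}
mapping, history = greenacre_collapse(example_counts, n_clusters=3)
```

Stopping at a fixed number of clusters keeps the sketch short. A fuller implementation would typically compare the candidate solutions, for example by the p-value of the chi-square test at each step, before choosing how many levels to keep.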

frupaul
Quartz | Level 8

Hi TWoodfield,

Thanks for taking the time to write out the code. I am even more curious as to why I couldn't use score code in the SAS Code node in Enterprise Miner. I'll check out the course you recommended, but in the meantime do you have any suggestions as to why the new variable I created didn't show up in subsequent nodes?

 

In Section 9.4 of "Applied Analytics Using SAS Enterprise Miner," the tutor shows how to consolidate the levels of a single variable using a decision tree. From that point, do I need to use the Replacement Editor to manually create the new levels based on the output of the decision tree? Or is there a way of applying those newly created levels automatically, perhaps by linking the node to another node? I did my predictive modeller certification a year ago, but that was not in the content.

Thanks,

 

Paul

TWoodfield
SAS Employee

The code I provided is a start toward writing a complete extension node for SAS Enterprise Miner. You can modify the metadata to make the new variable an Input variable. The program actually creates score code: it is the code stored in the location identified by the EM_FLOW_EMFLOWSCORECODE macro variable. You can learn more about extension nodes by downloading "SAS Enterprise Miner Extension Nodes: Developer's Guide." Search on "SAS Extension Nodes Developer's Guide." The latest path seems to be:

 

https://support.sas.com/documentation/cdl/en/emxndg/67980/PDF/default/emxndg.pdf

 

You can also learn more about extension nodes by taking the course "Extending SAS Enterprise Miner with User-Written Nodes." The documentation or the course will teach you about flow score code and publish score code.

 

If you consolidate levels using a Decision Tree node as described in the EM course, change the node's leaf role property to Input; the variable _NODE_ is then exported as an input for subsequent nodes.

 

If done correctly, you do not need a Replacement node.

 

Note that the input variable created using my code is automatically given the name of the original variable with suffix "_clus" added. You can obviously change the code to do anything you like.
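To make the role of score code concrete, here is a small sketch of what such code does conceptually: it replays a fixed level-to-cluster mapping on any incoming data and writes the result to a new variable carrying the "_clus" suffix. It is written in Python purely for illustration and is not the SAS score code the program generates. The function name, the variable `region`, the mapping values, and the fallback value for unseen levels are all hypothetical.

```python
def score_rows(rows, variable, level_to_cluster, unseen_cluster=-1):
    """Return copies of rows with an added '<variable>_clus' field.

    Levels missing from the mapping get unseen_cluster (an assumed fallback).
    """
    new_name = f"{variable}_clus"
    scored = []
    for row in rows:
        scored_row = dict(row)  # copy, so the input rows stay unchanged
        scored_row[new_name] = level_to_cluster.get(row.get(variable), unseen_cluster)
        scored.append(scored_row)
    return scored


# Hypothetical mapping produced by an earlier level-consolidation step.
level_to_cluster = {"A": 0, "B": 0, "C": 1, "D": 1, "E": 2}

# "Z" stands for a level that did not appear in the training data.
new_rows = [{"region": "A"}, {"region": "D"}, {"region": "Z"}]
scored = score_rows(new_rows, "region", level_to_cluster)
```

Because the mapping is fixed once it has been learned, the same logic can be applied to training, validation, and new data alike. For downstream nodes to use the new variable, it also needs an Input role in the metadata, as noted above.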

 

Regards,

Terry
