BrianLoe
Fluorite | Level 6

PROC REG allows one to specify a BY variable, which generates a separate regression model and set of coefficients for each value of the BY variable.
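For reference, the Base SAS behavior described above can be sketched as follows. This is only a minimal illustration: the SASHELP.CLASS sample table, the choice of variables, and the output names WORK.CLASS_SORTED and WORK.COEF_BY_SEX are assumptions for the example and are not part of the original question.

```sas
/* BY-group processing expects the input to be sorted by the BY variable. */
proc sort data=sashelp.class out=work.class_sorted;
    by sex;
run;

/* Fit one model per level of SEX; OUTEST= collects the coefficients. */
proc reg data=work.class_sorted outest=work.coef_by_sex noprint;
    by sex;
    model weight = height age;
run;
quit;

/* One row of parameter estimates per BY group. */
proc print data=work.coef_by_sex;
run;
```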

 

How does one achieve the same result in Enterprise Miner?

Accepted Solutions
MikeStockstill
SAS Employee

Hello BrianLoe-

 

In Enterprise Miner, you can use the Start Groups / End Groups node pair to perform by-group processing.  Set the Mode property to Stratify.  For details from within Enterprise Miner, select Help -> Contents -> Node Reference -> Utility Nodes -> Start Groups Node.

 

Not all modeling nodes produce the same output with group processing that they produce without it.  In the case of Start Groups -> Regression -> End Groups, you still need some code to see the parameter estimates at each iteration.  One method is to add a SAS Code node after the End Groups node.  Select the Code Editor property, and enter code like this (it assumes a new diagram that contains only one Regression node):

 

    proc print data=&EM_LIB..reg_effects_loop;
    run;

 

Close the window, run the node, and view the results.  The PROC PRINT output shows the coefficient value for each variable at each level of the BY group.
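If the printed table is wide or long, it may help to look at its structure first. The sketch below goes in the same SAS Code node and uses the same table name as above, so it carries the same assumption (a new diagram with a single Regression node). The 20-row limit is an arbitrary choice for illustration.

```sas
/* Assumption: the table name reg_effects_loop applies to your diagram,
   as in the PROC PRINT step above. */

/* List the columns in creation order to see which ones hold the
   group level, the effect name, and the estimate. */
proc contents data=&EM_LIB..reg_effects_loop varnum;
run;

/* Preview only the first 20 rows instead of the whole table. */
proc print data=&EM_LIB..reg_effects_loop(obs=20);
run;
```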

 

 

There is an alternative approach that involves no coding.  The alternative works well if your BY variable has only a handful of levels.  With this approach, use one Filter node and one Regression node for every level of the BY variable.  In each pair, use the Filter node to filter out the unwanted levels, and you get the usual Regression node results.  You can copy and paste the Filter / Regression pair, and manually modify each Filter node.  For example, if your BY variable has 5 levels, then you will have 5 Filter / Regression pairs that run in parallel.  If you want, connect each Regression node to a single empty SAS Code node so that you can run everything from that one node.
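For readers who think in Base SAS terms, the Filter / Regression pairs are roughly analogous to running the same procedure once per level with a WHERE statement. The sketch below is only an analogy, not what Enterprise Miner generates; the SASHELP.CLASS table and the model are assumptions chosen for illustration.

```sas
/* First "pair": keep only the F level, then fit the model. */
proc reg data=sashelp.class;
    where sex = 'F';
    model weight = height;
run;
quit;

/* Second "pair": same model, different level. */
proc reg data=sashelp.class;
    where sex = 'M';
    model weight = height;
run;
quit;
```

As with the node pairs, this gives the usual full output for each level, but the number of blocks grows with the number of levels.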

 

 

Which approach to consider depends on the overall goal of your flow.

 

 

If you have a very large number of BY variable levels, then you might want to consider using SAS Factory Miner, a product that is designed for analyzing data that contains a large number of segments (BY variable levels).

 

Thank you for your interest.

 

 


BrianRexing
Calcite | Level 5

Would SAS ever consider just allowing the user to define the variable role to be "BY" in the model variable editor?  Seems like that would be a way easier user experience.  Defining BY variables is so easy in SAS EG, but quite cumbersome in SAS EM.

 

I'm reading the start/end group documentation now, still haven't quite figured it out...

MikeStockstill
SAS Employee

Hello BrianRexing -

 

BY-variable processing (using a BY statement on a procedure) is a special case of group processing, whereas Enterprise Miner has additional methods of group processing available.  For what you want, take these steps:

 

 - Add a Start Groups node at the point where you want the group processing to begin.

 - Click the Variables property (or right-click the node and select Edit Variables).

 - In the Variables window, change the Grouping Role value to Stratification for the variable that you want to use to define your groups (your BY variable).  You can have more than one.

 - Close the Variables window.

 - Change the Start Groups Mode property to Stratify.  Use the Stratify mode to perform standard group processing.  When you use the Stratify mode, the Start Groups node loops through each level of the group variable when you run the process flow diagram.  When you select the Stratify mode, the Minimum Group Size and Target Group properties are enabled.

 - Add the nodes that you want to process repeatedly.

 - Add an End Groups node to close the loop.
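As a rough Base SAS analogy for the loop that the steps above set up, the sketch below fits one regression per level of a grouping variable. It is not how Enterprise Miner implements the Start Groups node. The macro name reg_per_level, its parameters, and the SASHELP.CLASS example are all hypothetical, and the macro assumes a character grouping variable and SAS 9.3 or later (for the open-ended INTO range).

```sas
/* Hypothetical macro: one regression per level of a character variable. */
%macro reg_per_level(data=, byvar=, dep=, indep=);
    %local i nlevels;

    /* Collect the distinct levels as quoted strings: "F", "M", ... */
    proc sql noprint;
        select distinct quote(strip(&byvar))
            into :level1-
            from &data;
    quit;
    %let nlevels = &sqlobs;

    /* One pass per level, similar in spirit to one group iteration. */
    %do i = 1 %to &nlevels;
        proc reg data=&data;
            where &byvar = &&level&i;
            model &dep = &indep;
        run;
        quit;
    %end;
%mend reg_per_level;

%reg_per_level(data=sashelp.class, byvar=sex, dep=weight, indep=height);
```

In plain SAS, a BY statement is usually simpler than a macro loop like this; the sketch is here only to make the "loops through each level" behavior concrete.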

 

For an example, see Help -> Contents -> Node Reference -> Utility Nodes -> Start Groups Node -> Start Groups Node Example.

 

For details about all of the group processing modes that are available, see the Start Groups Node Train Properties: General section of that same chapter.

 

Have a nice weekend.
