DarekB

Hello,

I would like to analyse my dataset by groups in SAS Enterprise Miner. I would like to run the analysis separately in every stratum of my dataset (identified by a segmenting variable) and, at the same time, for every target variable that I have (my dataset contains many binary target variables). In other words, I want to build a separate model for every target variable in every stratum, all in one flow.

SAS EM won’t let me nest two or more Start/End Groups pairs in one flow. I tried one Start Groups node with the Mode property set to “Stratify”, followed by another set to “Target”, plus two End Groups nodes at the end (screenshot below).

Could you provide a hint on how to do this properly?

Thanks,

Darek

 

DarekB_0-1678395003731.png

 

My dataset looks like this:

 

segment  target1  target2  target3  input1       input2       input3
1        0        1        1        3.546002385  0.500653822  2.000653822
1        0        0        1        3.252634764  0.000667064  1.500667064
1        0        0        0        3.515556887  0.000404071  0.500404071
2        1        0        1        3.238808053  0.000407713  0.500407713
2        0        0        1        3.986324948  0.000460164  0.500460164

Dom_Latour
SAS Employee

If I understand your scenario correctly, a single Group Processing node should be sufficient to do what you are asking. I will use a simple example based on one of our demo data sets (SAMPSIO.HMEQ) to explain.

 

Consider this simple diagram/flow:

Dom_Latour_2-1678972951665.png

 

The metadata in the Variables tab of the IDS node contains two target variables, BAD and VALUE, and one segment variable, REASON.

 

Dom_Latour_6-1678974161186.png

 

In the Start Groups node, set the Mode property, under the General Group, to "Stratify" and the Target Group property to "Yes".  This indicates to the Start Groups node that it should loop over all target/group combinations.  A segment variable is considered, by default, to be a stratification variable.

 

Dom_Latour_3-1678973188744.png 

 

When you run from the End Groups node (or one of its successors in the flow), the Decision Tree node runs 6 times in this example, because two targets (BAD and VALUE) are specified and the segment variable has 3 strata (blank, DebtCon, HomeImp). You can see this once the flow has completed by examining the results of the End Groups node. These include, among others:

  • a Summary report of each loop, which shows that every target was indeed modeled for each stratum

Dom_Latour_7-1678974365593.png

 

  • Under the View menu, you can access various results for each loop and each target

Dom_Latour_5-1678973640328.png
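
For readers who think in code, the looping behaviour described above can be pictured as a cross product of strata and targets. The following is only a rough Python sketch of that idea, not how Enterprise Miner implements group processing. The function name `group_iterations` is hypothetical, the blank stratum is represented by an empty string, and the order in which the node actually visits the combinations is an assumption.

```python
from itertools import product

# Values taken from the SAMPSIO.HMEQ example above.
targets = ["BAD", "VALUE"]
strata = ["", "DebtCon", "HomeImp"]  # "" stands in for the blank REASON level


def group_iterations(strata, targets):
    """Return one (stratum, target) pair per model that would be trained.

    Hypothetical helper: it only enumerates the combinations; the real
    ordering used by the Start Groups node may differ.
    """
    return list(product(strata, targets))


iterations = group_iterations(strata, targets)

for i, (stratum, target) in enumerate(iterations, start=1):
    label = stratum or "(blank)"
    print(f"loop {i}: REASON={label}, target={target}")
```

With 3 strata and 2 targets the list has 6 entries, matching the 6 runs of the Decision Tree node reported by the End Groups node.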

 

Hope this helps,

Dom

 

 
