Hello,
I would like to analyse my dataset by groups in SAS Enterprise Miner. I would like to fit models separately in every stratum of my dataset (identified by a segmenting variable) and, at the same time, for every target variable (I have many binary target variables in my dataset). In other words, I want to build a separate model for every target variable in every stratum, all in one flow.
SAS EM won’t let me use two or more nested Start/End Groups node pairs in one flow. I tried one Start Groups node with the Mode property set to “Stratify”, followed by another set to “Target”, plus two End Groups nodes at the end (screenshot below).
Could you give me a hint on how to do this properly?
Thanks,
Darek
My dataset looks like this:
segment | target1 | target2 | target3 | input1 | input2 | input3 |
1 | 0 | 1 | 1 | 3.546002385 | 0.500653822 | 2.000653822 |
1 | 0 | 0 | 1 | 3.252634764 | 0.000667064 | 1.500667064 |
1 | 0 | 0 | 0 | 3.515556887 | 0.000404071 | 0.500404071 |
2 | 1 | 0 | 1 | 3.238808053 | 0.000407713 | 0.500407713 |
2 | 0 | 0 | 1 | 3.986324948 | 0.000460164 | 0.500460164 |
If I understand your scenario correctly, a single Group Processing node should be sufficient to do what you are asking. I will use a simple example based on one of our demo data sets (SAMPSIO.HMEQ) to explain:
Consider the simple diagram/flow
The metadata in the Variables tab of the IDS node contains two target variables, BAD and VALUE, and one segment variable, REASON.
In the Start Groups node, set the Mode property, under the General Group, to "Stratify" and the Target Group property to "Yes". This indicates to the Start Groups node that it should loop over all target/group combinations. A segment variable is considered, by default, to be a stratification variable.
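The looping described above can be pictured with a small Python sketch. This is only an illustration of the stratum-by-target enumeration, not how Enterprise Miner implements it: the helper `group_runs` is hypothetical, the loop order (strata outer, targets inner) is an assumption, and the rows are copied from the sample data in the question with the input columns omitted.

```python
from itertools import product

# Rows copied from the sample data in the question (input columns omitted).
rows = [
    {"segment": 1, "target1": 0, "target2": 1, "target3": 1},
    {"segment": 1, "target1": 0, "target2": 0, "target3": 1},
    {"segment": 1, "target1": 0, "target2": 0, "target3": 0},
    {"segment": 2, "target1": 1, "target2": 0, "target3": 1},
    {"segment": 2, "target1": 0, "target2": 0, "target3": 1},
]
targets = ["target1", "target2", "target3"]


def group_runs(rows, segment_var, targets):
    """Hypothetical helper: list one (stratum, target, row count) per model run."""
    # Distinct values of the segment variable define the strata.
    strata = sorted({r[segment_var] for r in rows})
    runs = []
    # Assumed order: strata in the outer loop, targets in the inner loop.
    for stratum, target in product(strata, targets):
        subset = [r for r in rows if r[segment_var] == stratum]
        # A real flow would train a model on `subset` for `target` here.
        runs.append((stratum, target, len(subset)))
    return runs


runs = group_runs(rows, "segment", targets)
for stratum, target, n in runs:
    print(f"segment={stratum} target={target} rows={n}")
```

With two segments and three targets, the sketch enumerates six stratum/target combinations, one per model that would be built.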
When you run the flow from the End Groups node (or one of its successors), the Decision Tree node runs 6 times in this example, because two targets (BAD and VALUE) are specified and the segment variable has 3 strata (blank, DebtCon, HomeImp). You can see this once the flow has completed by examining the results of the End Groups node. It contains, among others:
Hope this helps,
Dom