Hello,
I would like to analyse my dataset by groups in SAS Enterprise Miner. I would like to run the analysis separately in every stratum of my dataset (identified by a segmenting variable) and, at the same time, for every target variable (I have many binary target variables in my dataset). In other words, I want to build a separate model for every target variable in every stratum, all in one flow.
SAS EM won’t let me use two or more nested Start/End Groups node pairs in one flow. I tried one Start Groups node with the Mode property set to “Stratify”, followed by another set to “Target”, plus two End Groups nodes at the end (screenshot below).
Could you give me a hint on how to do this properly?
Thanks,
Darek
My dataset looks like this:
segment | target1 | target2 | target3 | input1 | input2 | input3 |
1 | 0 | 1 | 1 | 3.546002385 | 0.500653822 | 2.000653822 |
1 | 0 | 0 | 1 | 3.252634764 | 0.000667064 | 1.500667064 |
1 | 0 | 0 | 0 | 3.515556887 | 0.000404071 | 0.500404071 |
2 | 1 | 0 | 1 | 3.238808053 | 0.000407713 | 0.500407713 |
2 | 0 | 0 | 1 | 3.986324948 | 0.000460164 | 0.500460164 |
If I understand your scenario correctly, a single pair of Start Groups/End Groups nodes should be sufficient to do what you are asking. I will use a simple example based on one of our demo data sets (SAMPSIO.HMEQ) to explain:
Consider the simple diagram/flow
The metadata in the Variables tab of the IDS node contains two target variables, BAD and VALUE, and one segment variable, REASON.
In the Start Groups node, set the Mode property, under the General Group, to "Stratify" and the Target Group property to "Yes". This indicates to the Start Groups node that it should loop over all target/group combinations. A segment variable is considered, by default, to be a stratification variable.
When you run from the End Groups node (or one of its successors in the flow), the Decision Tree node runs 6 times in this example, because two targets (BAD and VALUE) are specified and the segment variable has 3 strata (blank, DebtCon, HomeImp). You can see this once the flow has completed by examining the results of the End Groups node. They contain, among others:
- a Summary report of each loop, which shows that every target was indeed modeled for each stratum
- Under the View menu, you can access various results for each loop and each target
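To make the loop count concrete, here is a small Python sketch (not SAS EM code) that enumerates the stratum/target combinations from the HMEQ example. The function name `enumerate_loops` is made up for illustration, and the order in which Enterprise Miner actually processes the loops is an assumption; only the count of 2 targets x 3 strata = 6 comes from the example above.

```python
from itertools import product

# Values taken from the HMEQ example: two targets and the three
# strata of the segment variable REASON (blank, DebtCon, HomeImp).
targets = ["BAD", "VALUE"]
strata = ["", "DebtCon", "HomeImp"]


def enumerate_loops(targets, strata):
    """Return one (stratum, target) pair per group-processing loop.

    Hypothetical helper: the ordering here is illustrative and may
    differ from the order Enterprise Miner uses internally.
    """
    return list(product(strata, targets))


loops = enumerate_loops(targets, strata)

for i, (stratum, target) in enumerate(loops, start=1):
    label = stratum if stratum else "(blank)"
    print(f"loop {i}: stratum={label}, target={target}")
```

With the three targets and two segments from the original question, the same enumeration would give 3 x 2 = 6 models as well.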
Hope this helps,
Dom