
Multiple equal sized samples for Modeling

Frequent Contributor
Posts: 126


Hi all,

In EM, I am trying to create equal-sized samples of events and non-events in order to perform modeling later on.

I want all the events to be present in every derived sample, but the non-events should change each time I create a new sample.

In SAS Base I would run a loop and get the requested number of samples (e.g., 500), each always containing the events but with different non-events.

How is that possible in EM?

Thank you in advance

SAS Employee
Posts: 68

Re: Multiple equal sized samples for Modeling

Posted in reply to chemicalab

Hi, there are a few options for oversampling inside EM. The most straightforward approach is to use the Sample node.

My thought is to use the Sample node with 100% for the percentage and Criterion=Level to keep an equal number of non-events. You can change the seed to get different non-events each time. Depending on how many events you wish to keep, you can change the options within the Level/Stratify properties.
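To make the idea concrete outside EM, here is a minimal Python sketch of what one such draw does: keep every event, then pick an equal number of non-events at random, with the seed controlling which non-events are chosen. It is not EM code and does not reproduce the Sample node itself. The function name `balanced_sample`, the `target` column, the event value 1, and the toy data are all assumptions made for illustration, and the sketch assumes non-events outnumber events.

```python
import random

def balanced_sample(rows, target="target", event=1, seed=12345):
    """Keep every event and draw an equal number of non-events at random.

    Hypothetical helper for illustration; assumes there are at least as
    many non-events as events.
    """
    events = [r for r in rows if r[target] == event]
    non_events = [r for r in rows if r[target] != event]
    rng = random.Random(seed)  # changing the seed changes which non-events are drawn
    drawn = rng.sample(non_events, k=len(events))  # sampling without replacement
    return events + drawn

# Toy data (made up): 5 events and 95 non-events.
rows = [{"id": i, "target": 1 if i < 5 else 0} for i in range(100)]
sample_a = balanced_sample(rows, seed=1)
sample_b = balanced_sample(rows, seed=2)
```

Both samples contain the same 5 events. Only the 5 non-events depend on the seed.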

To run this through an iterative process, you could use the Start and End Group nodes. See the flow below:

IDS - Start Group (Mode=Index) - Sample - Code node (to aggregate the samples) - End Group
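The logic of that loop can be sketched in plain Python as follows: for each index, draw a fresh set of non-events, combine it with the full set of events, and stack the result with a sample identifier so each sample can be modeled separately later. This only illustrates the iteration and the aggregation step. It is not what the Group or Code nodes execute, and the names `stacked_samples`, `sample_id`, and `base_seed` are hypothetical.

```python
import random

def stacked_samples(rows, n_samples, target="target", event=1, base_seed=12345):
    """Build n_samples balanced samples and stack them with a sample_id.

    Hypothetical helper for illustration; assumes there are at least as
    many non-events as events.
    """
    events = [r for r in rows if r[target] == event]
    non_events = [r for r in rows if r[target] != event]
    stacked = []
    for index in range(1, n_samples + 1):
        # A distinct seed per iteration gives a different set of non-events.
        rng = random.Random(base_seed + index)
        drawn = rng.sample(non_events, k=len(events))
        for r in events + drawn:
            stacked.append({**r, "sample_id": index})
    return stacked

# Toy data (made up): 4 events and 40 non-events, 3 samples.
rows = [{"id": i, "target": 1 if i < 4 else 0} for i in range(44)]
stacked = stacked_samples(rows, n_samples=3)
```

Filtering `stacked` on `sample_id` recovers one balanced sample at a time. In the asker's case `n_samples` would be 500.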

For more advanced processing, you can contact SAS Tech Support, who are excellent resources (part of your license!) to work with.



Frequent Contributor
Posts: 126

Re: Multiple equal sized samples for Modeling

Hi there, thank you for your input. I had tried that in EM, but it didn't work as expected. It's strange that EM has no built-in way to loop over equal-sized samples, perform the modeling, and produce the comparison, as it normally does when comparing two different modeling techniques. I can do it in SAS Base, but I wanted to try it out in EM.

I will take you up on the advice and contact Tech Support as well.

Thanks again for the reply.
