
Doubts about oversampling

I am trying to predict a rare event. I read about using oversampling with the sampling node, both at the following link and in EM's Help:
http://support.sas.com/kb/24/205.html

The link says that I'm not supposed to adjust the frequency for oversampling, but EM's Help says I should. My intention is to build a model and then score a large database with it. Should I adjust the frequency for oversampling or not?

I tried both approaches, and the cumulative lift and even some of the resulting independent variables are very different.
SAS Employee
Posts: 5

Re: Doubts about oversampling

Hi,

What Enterprise Miner version are you using? In Enterprise Miner 5.x, do not select the "adjust frequency for oversampling" check box, because it offsets the level-based sampling / over-sampling. To my mind, you can use either the level-based sampling approach to over-sampling OR the adjust frequency approach to over-sampling. I use diagrams like this one in EM 5.3:

Input Data Source (_without_ a target profile)
>
Sample Node (with level-based sampling, no frequency adjustment)
>
Decision node (create an appropriate target profile to reflect the true priors)
>
[...]
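
To illustrate what "reflect the true priors" means in the Decision node step, here is a minimal Python sketch of the standard prior-correction formula for posterior probabilities from a model trained on an oversampled data set. It is not Enterprise Miner code, and it assumes the usual rescaling by the ratio of true to sample priors. The function name and the numbers (a 2% true event rate, a 50/50 sample) are hypothetical.

```python
def adjust_posterior(p_sample, true_prior, sample_prior):
    """Rescale an event probability from an oversampled model to the true prior.

    p_sample     -- event probability predicted on the oversampled scale
    true_prior   -- event rate in the population to be scored (assumed known)
    sample_prior -- event rate in the oversampled training data
    """
    # Weight each class by (true prior / sample prior), then renormalize.
    event = p_sample * true_prior / sample_prior
    non_event = (1.0 - p_sample) * (1.0 - true_prior) / (1.0 - sample_prior)
    return event / (event + non_event)


# Hypothetical example: 50/50 oversampled training data, 2% true event rate.
adjusted = adjust_posterior(0.5, true_prior=0.02, sample_prior=0.5)
print(adjusted)
```

The correction is monotone in the predicted probability, so it changes the probability scale but not the rank order of the scored records.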

Cheers,
Karsten