Sampling

Occasional Contributor
Posts: 7

I am working on a predictive model with about 430K observations and a binary target. The rare event (1) occurs in only about 1,000 of them. I need a way to sample the data so that events make up about 50% of the modeling set. Currently I only have access to SAS Enterprise Guide.

 

What are some recommendations on how to handle this with sample code?

 

Thanks

 


Accepted Solutions
Solution
‎12-04-2017 09:51 AM
Super User
Posts: 10,698

Re: Sampling

Do you want to oversample to a 1:1 ratio?

 

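/* Toy example: sashelp.class stands in for the modeling data, sex for the binary target */
data class;
 set sashelp.class;
run;
/* The STRATA statement expects the input sorted by the strata variable */
proc sort data=class;
 by sex;
run;
/* Draw 5 rows from each stratum, in sorted order (F, then M), for a 1:1 sample */
proc surveyselect data=class out=want sampsize=(5 5) seed=12345678;
 strata sex;
run;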



All Replies
Super User
Posts: 23,343

Re: Sampling


Which Task/Proc are you using? I'm assuming logistic regression, but that depends partly on your variables.

 

If you define your model/proc you can find full examples in the SAS documentation for that procedure.

Esteemed Advisor
Posts: 5,483

Re: Sampling

Use proc surveyselect, stratify by your target variable, and request sampling rates of 1 and 1/430 for your target and non target strata, respectively.

PG
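
A minimal sketch of the sampling-rate approach described above. The table name work.model_data and the 0/1 variable name target are assumptions for illustration; neither appears in the original post. The rate 0.002326 approximates 1/430, which should leave roughly 1,000 non-events next to the roughly 1,000 events, but the exact counts depend on the real data.

```sas
/* Hypothetical names: work.model_data is the full table, target is the 0/1 outcome */
proc sort data=work.model_data out=work.sorted;
  by target;
run;

/* Rates are listed in strata (sort) order: target=0 first, then target=1. */
/* 0.002326 is about 1/430; a rate of 1 keeps every row in the event stratum. */
proc surveyselect data=work.sorted out=work.balanced
    method=srs
    samprate=(0.002326 1)
    seed=12345678;
  strata target;
run;
```

Because the rates are matched to strata by position, it is worth confirming the sort order of the target values before relying on the result.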
Super User
Posts: 13,358

Re: Sampling


PGStats wrote:

Use proc surveyselect, stratify by your target variable, and request sampling rates of 1 and 1/430 for your target and non target strata, respectively.


 

Or, instead of a sampling rate, you can specify an exact sample size. If you request 100 from each stratum, your sample will be half one class and half the other.

It is helpful to know that Surveyselect will provide both the selection probability and the sampling weight for each record.
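
A sketch of the exact-size variant, again assuming a hypothetical table work.model_data with a 0/1 variable named target. The STATS option is used here to make sure the SelectionProb and SamplingWeight variables are written to the output; the size of 1,000 per stratum is taken from the approximate event count in the question and may need adjusting.

```sas
/* Hypothetical names: work.model_data is the full table, target is the 0/1 outcome */
proc sort data=work.model_data out=work.sorted;
  by target;
run;

/* Request 1,000 rows from each stratum (target=0, then target=1).          */
/* SELECTALL takes the whole stratum if it holds fewer than 1,000 rows,     */
/* in which case the sample will not be exactly 1:1.                        */
/* STATS adds SelectionProb and SamplingWeight to the output data set.      */
proc surveyselect data=work.sorted out=work.balanced
    method=srs
    sampsize=(1000 1000)
    selectall
    stats
    seed=12345678;
  strata target;
run;

/* Unweighted counts show the balanced sample */
proc freq data=work.balanced;
  tables target;
run;

/* Weighted counts should approximate the original class proportions */
proc freq data=work.balanced;
  tables target;
  weight SamplingWeight;
run;
```

Comparing the two PROC FREQ tables is one way to see what the sampling weight carries: the first reflects the balanced sample, the second scales each row back to the population it represents.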

Occasional Contributor
Posts: 7

Re: Sampling

I do want to oversample as a 1:1 ratio.

☑ This topic is solved.
