ertr
Quartz | Level 8

Hi all,

 

I know there is a lot of material on this subject on the internet. However, when I try to get a 50%/50% bad-rate sample, either with SAS code or with the Sample node in Enterprise Miner, I cannot reach my goal. I am trying to draw a stratified sample based on the Target and Date variables. I found some code on the internet and also tried the Sample node, but I could not get a 50/50 sample. I don't know exactly which values I should select in the Properties panel of the Enterprise Miner Sample node.

 

I used the following properties when trying to get the sample in Enterprise Miner:

 

[Image: Desired.png]

 

Alternatively, if there is code to undersample the data based on the bad rate, I would like to learn how to draw the sample in Enterprise Guide.

 

I have a sample data set below. I want to get a sample of 12 rows with Target=1 and 12 rows with Target=0, stratified on the Target and Date variables. If someone can help, I will be glad to learn these methods.

 

Data Have;
Length ID 8 Date $ 20 Variable1 8 Variable2 8 Variable3 8 Target 8;
Infile Datalines Missover ;
Input ID Date Variable1 Variable2 Variable3 Target;
Datalines;
1 20150101 100 200 300 0
1 20150201 100 200 300 1
1 20150301 100 200 300 0
2 20150101 100 200 300 1
2 20150201 100 200 300 0
2 20150301 100 200 300 0
3 20150101 100 200 300 0
3 20150201 100 200 300 0
3 20150301 100 200 300 1
4 20150101 100 200 300 0
4 20150201 100 200 300 0
4 20150301 100 200 300 1
5 20150101 100 200 300 1
5 20150201 100 200 300 0
5 20150301 100 200 300 0
6 20150101 100 200 300 0
6 20150201 100 200 300 1
6 20150301 100 200 300 0
7 20150101 100 200 300 1
7 20150201 100 200 300 0
7 20150301 100 200 300 0
8 20150101 100 200 300 0
8 20150201 100 200 300 1
8 20150301 100 200 300 0
9 20150101 100 200 300 0
9 20150201 100 200 300 0
9 20150301 100 200 300 1
10 20150101 100 200 300 0
10 20150201 100 200 300 0
10 20150301 100 200 300 1
11 20150101 100 200 300 1
11 20150201 100 200 300 0
11 20150301 100 200 300 0
12 20150101 100 200 300 0
12 20150201 100 200 300 1
12 20150301 100 200 300 0
;
Run;

Thank you,

PGStats
Opal | Level 21

Use PROC SURVEYSELECT:

 

proc sort data=have; by date target; run;

proc surveyselect data=have out=samples sampsize=12;
strata date target;
id id;
run;

If you want to oversample, i.e. get a sample size greater than the population, then do:

 

proc sort data=have; by date target; run;

proc surveyselect data=have out=samples sampsize=12 method=urs outhits;
strata date target;
id id;
run;
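
For readers unfamiliar with the two options used above: METHOD=URS requests unrestricted random sampling, which selects with replacement, so the same row can be chosen more than once and the requested size may exceed the stratum size. By default the output then holds one row per distinct selected unit, with a NumberHits variable counting how often it was chosen; OUTHITS writes one row per selection instead. The following is a minimal sketch for comparing the two outputs on the posted data; the data set names have_sorted, urs_units and urs_hits and the seed value are arbitrary choices for illustration.

```sas
/* Sort by the strata variables, as PROC SURVEYSELECT requires for STRATA */
proc sort data=have out=have_sorted;
  by date target;
run;

/* With replacement, without OUTHITS: one row per distinct selected unit, */
/* plus a NumberHits column giving how many times it was drawn            */
proc surveyselect data=have_sorted out=urs_units
                  method=urs sampsize=12 seed=2015;
  strata date target;
run;

/* With replacement, with OUTHITS: one output row per selection,          */
/* so each Date-by-Target stratum contributes exactly 12 rows             */
proc surveyselect data=have_sorted out=urs_hits
                  method=urs sampsize=12 seed=2015 outhits;
  strata date target;
run;

/* Compare row counts per stratum in the two outputs */
proc freq data=urs_units;
  tables date*target / norow nocol nopercent;
run;

proc freq data=urs_hits;
  tables date*target / norow nocol nopercent;
run;
```

Because SAMPSIZE=12 applies to each of the six Date-by-Target strata, the OUTHITS output has 72 rows in total, not 24.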

 

PG
ertr
Quartz | Level 8

Thank you,

 

Your first code gives the following errors:

 

ERROR: The sample size, 12, is greater than the number of sampling units, 8.
ERROR: The sample size, 12, is greater than the number of sampling units, 4.

 

I also don't exactly understand what your second code produces, or what METHOD=URS and OUTHITS do. Could you give more detail, please? I want code that outputs 24 rows, 12 bad and 12 good, based on Target and Date.

 

And in Enterprise Miner, what should I do to get the following results?

 

[Image: MinerOut.png]

ertr
Quartz | Level 8

Any suggestions on this subject?
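
One reading of the two errors quoted earlier in the thread: when a STRATA statement is present, SAMPSIZE= is applied to each stratum, and in the posted data every Date value has 8 rows with Target=0 and 4 rows with Target=1, so 12 rows cannot be drawn without replacement from any stratum. A minimal sketch, assuming the goal is 24 rows balanced on Target within each Date, is to draw 4 rows from each of the six Date-by-Target strata (3 x 4 = 12 bad and 3 x 4 = 12 good). The names have_sorted and balanced and the seed value are arbitrary choices for illustration; the ID statement is left out so that all variables are kept in the output.

```sas
/* Sort by the strata variables before sampling */
proc sort data=have out=have_sorted;
  by date target;
run;

/* Simple random sampling without replacement, 4 rows per stratum.  */
/* Strata are Date x Target: 3 dates x 2 target values = 6 strata,  */
/* so the output has 24 rows: 12 with Target=1 and 12 with Target=0 */
proc surveyselect data=have_sorted out=balanced
                  method=srs sampsize=4 seed=12345;
  strata date target;
run;

/* Check the resulting counts per Date and Target */
proc freq data=balanced;
  tables date*target / norow nocol nopercent;
run;
```

In the posted data the Target=1 strata contain exactly 4 rows each, so all bad rows are kept and only the good rows are actually undersampled. On real data with unequal stratum sizes, the per-stratum size would have to be no larger than the smallest stratum, or supplied per stratum.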
