ertr
Quartz | Level 8

Hi all,

 

I know this topic is discussed in several places online. However, when I try to draw a sample with a 50%/50% bad rate, either with SAS code or with the Sample node in Enterprise Miner, I cannot get the result I want. I am trying to draw a stratified sample based on the Target and Date variables. I found some code online and also tried the Sample node, but I could not get a 50/50 sample. I don't know exactly which values to select in the Properties panel of the Enterprise Miner Sample node.

 

I used the following properties when trying to draw the sample in Enterprise Miner:

 

Desired.png

 

Alternatively, if there is code that undersamples the data based on the bad rate, I would like to learn how to draw the sample in Enterprise Guide.

 

I have a sample data set below. I want a sample with 12 rows where Target=1 and 12 rows where Target=0, based on the Target and Date variables. If someone can help, I would be glad to learn these methods.

 

Data Have;
Length ID 8 Date $ 20 Variable1 8 Variable2 8 Variable3 8 Target 8;
Infile Datalines Missover ;
Input ID Date Variable1 Variable2 Variable3 Target;
Datalines;
1 20150101 100 200 300 0
1 20150201 100 200 300 1
1 20150301 100 200 300 0
2 20150101 100 200 300 1
2 20150201 100 200 300 0
2 20150301 100 200 300 0
3 20150101 100 200 300 0
3 20150201 100 200 300 0
3 20150301 100 200 300 1
4 20150101 100 200 300 0
4 20150201 100 200 300 0
4 20150301 100 200 300 1
5 20150101 100 200 300 1
5 20150201 100 200 300 0
5 20150301 100 200 300 0
6 20150101 100 200 300 0
6 20150201 100 200 300 1
6 20150301 100 200 300 0
7 20150101 100 200 300 1
7 20150201 100 200 300 0
7 20150301 100 200 300 0
8 20150101 100 200 300 0
8 20150201 100 200 300 1
8 20150301 100 200 300 0
9 20150101 100 200 300 0
9 20150201 100 200 300 0
9 20150301 100 200 300 1
10 20150101 100 200 300 0
10 20150201 100 200 300 0
10 20150301 100 200 300 1
11 20150101 100 200 300 1
11 20150201 100 200 300 0
11 20150301 100 200 300 0
12 20150101 100 200 300 0
12 20150201 100 200 300 1
12 20150301 100 200 300 0
;
Run;

Thank you,

4 REPLIES
PGStats
Opal | Level 21

Use PROC SURVEYSELECT:

 

proc sort data=have; by date target; run;

proc surveyselect data=have out=samples sampsize=12;
strata date target;
id id;
run;

If you want to oversample, i.e. get a sample size greater than the population, then do:

 

proc sort data=have; by date target; run;

proc surveyselect data=have out=samples sampsize=12 method=urs outhits;
strata date target;
id id;
run;
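
For context on the oversampling step: as far as I understand the PROC SURVEYSELECT documentation, METHOD=URS is unrestricted random sampling, i.e. equal-probability selection with replacement, so the same row can be drawn more than once and the requested size may exceed the stratum size. OUTHITS writes one output row per draw; without it, each selected row appears once and a NumberHits variable records how often it was drawn. A minimal sketch for comparing the two outputs, assuming the Have data set from the question; the SEED= value and the names have_sorted, urs_units and urs_hits are illustrative only.

```sas
proc sort data=have out=have_sorted;
    by date target;
run;

/* With replacement, default output: one row per distinct selected unit;
   NumberHits counts how many times each unit was drawn. */
proc surveyselect data=have_sorted out=urs_units
                  method=urs sampsize=12 seed=2015;
    strata date target;
run;

/* Same settings plus OUTHITS: one row per hit, so every
   date*target stratum contributes exactly 12 rows. */
proc surveyselect data=have_sorted out=urs_hits
                  method=urs sampsize=12 seed=2015 outhits;
    strata date target;
run;

/* Row counts per stratum in the OUTHITS output. */
proc freq data=urs_hits;
    tables date*target / norow nocol nopercent;
run;
```

Note that SAMPSIZE= applies to each stratum, so with six Date × Target strata this yields 6 × 12 = 72 rows rather than 24.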

 

PG
ertr
Quartz | Level 8

Thank you,

 

Your first code gives the following errors:

 

ERROR: The sample size, 12, is greater than the number of sampling units, 8.
ERROR: The sample size, 12, is greater than the number of sampling units, 4.

 

I also don't fully understand what your second code produces, or what METHOD=URS and OUTHITS do. Could you give more detail, please? I want code that outputs 24 rows, 12 bad and 12 good, based on Target and Date.
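
The two errors above reflect that SAMPSIZE= is applied per stratum: each Date holds 8 rows with Target=0 and 4 rows with Target=1, so 12 rows cannot be drawn from either without replacement. One possible way to get 24 rows (12 bad, 12 good) is to request 4 rows from each of the six Date × Target strata. This is only a sketch, assuming the Have data set from the first post; the seed and the names have_sorted and sample_5050 are arbitrary.

```sas
proc sort data=have out=have_sorted;
    by date target;
run;

/* Simple random sampling without replacement, 4 rows per stratum:
   all 4 bads and 4 of the 8 goods for each date. */
proc surveyselect data=have_sorted out=sample_5050
                  method=srs sampsize=4 seed=2015;
    strata date target;
run;

/* Check the balance across Target and across Date by Target. */
proc freq data=sample_5050;
    tables target date*target / norow nocol nopercent;
run;
```

Because no ID statement is used here, all variables from Have are carried into the output along with the selection probability and sampling weight that the procedure adds.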

 

And in Enterprise Miner, what should I do to get the following results?

 

MinerOut.png

ertr
Quartz | Level 8

Any suggestions on this?
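
One possible approach to undersampling by bad rate without hard-coding the sample size is to count the bads per Date and pass those counts to PROC SURVEYSELECT through a SAMPSIZE= data set, which is expected to hold the strata variables plus a variable named _NSIZE_. The sketch below assumes the Have data set from the question and that Target is coded 0/1; the data set names sizes, have_sorted and balanced are hypothetical, and the seed is arbitrary.

```sas
/* For every Date*Target stratum, request as many rows as there
   are bads (Target=1) on that Date. */
proc sql;
    create table sizes as
    select a.Date, a.Target, b.n_bad as _NSIZE_
    from (select distinct Date, Target from have) as a
         inner join
         (select Date, sum(Target=1) as n_bad
            from have
           group by Date) as b
         on a.Date = b.Date;
quit;

/* Both data sets must be sorted by the strata variables. */
proc sort data=sizes;
    by Date Target;
run;

proc sort data=have out=have_sorted;
    by Date Target;
run;

/* Keeps every bad and an equal number of randomly chosen goods per Date. */
proc surveyselect data=have_sorted out=balanced
                  method=srs sampsize=sizes seed=2015;
    strata Date Target;
run;
```

If some Date had more bads than goods, the Target=0 stratum would hit the same "sample size is greater than the number of sampling units" error; the SELECTALL option is documented as taking all units in that situation, though I have not tested it on this data.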
