Hi, I want to generate 100 datasets, each containing 25 random observations drawn from an original dataset with PROC SURVEYSELECT, using a loop.
Here is the original dataset:
1001 | 0 | 1740 | 971 | 9999 |
1002 | 0 | 2600 | 260 | 9999 |
1003 | 0 | 4687 | 970 | 9999 |
1004 | 0 | 3172 | 532 | 9999 |
1005 | 0 | 1200 | 67 | 9999 |
1006 | 0 | 4278 | 524 | 9999 |
1007 | 0 | 4414 | 68 | 9999 |
1008 | 0 | 4829 | 298 | 9999 |
1009 | 0 | 2091 | 690 | 9999 |
1010 | 0 | 4908 | 227 | 9999 |
1011 | 0 | 3753 | 413 | 9999 |
1012 | 0 | 3235 | 288 | 9999 |
1013 | 0 | 2904 | 845 | 9999 |
1014 | 0 | 3539 | 591 | 9999 |
1015 | 0 | 3331 | 378 | 9999 |
1016 | 0 | 3914 | 507 | 9999 |
1017 | 0 | 4725 | 930 | 9999 |
1018 | 0 | 3359 | 298 | 9999 |
1019 | 0 | 2565 | 473 | 9999 |
1020 | 0 | 3719 | 169 | 9999 |
1021 | 0 | 1667 | 872 | 9999 |
1022 | 0 | 2196 | 935 | 9999 |
1023 | 0 | 4602 | 569 | 9999 |
1024 | 0 | 1199 | 136 | 9999 |
1025 | 0 | 3046 | 434 | 9999 |
1026 | 0 | 1705 | 666 | 9999 |
1027 | 0 | 2620 | 125 | 9999 |
1028 | 0 | 2814 | 200 | 9999 |
1029 | 0 | 3300 | 739 | 9999 |
1030 | 0 | 2760 | 50 | 9999 |
1031 | 0 | 3090 | 344 | 9999 |
1032 | 0 | 1091 | 713 | 9999 |
1033 | 0 | 4749 | 446 | 9999 |
1034 | 0 | 4788 | 713 | 9999 |
1035 | 0 | 1414 | 176 | 9999 |
1036 | 0 | 2076 | 615 | 9999 |
1037 | 0 | 2683 | 72 | 9999 |
1038 | 0 | 2434 | 712 | 9999 |
1039 | 0 | 1760 | 148 | 9999 |
1040 | 0 | 3248 | 271 | 9999 |
Here is the code that selects one sample from the original dataset:
proc surveyselect data=ORD
method=pps sampsize=25 out = randomsurvey_PPS;
size b ;
run;
Is there a way to generate 100 datasets with PROC SURVEYSELECT?
I suppose the code structure should be something like:
Do i = 1 to 100;
proc surveyselect data = ORD
method = pps
out = dataout i ?
run;
Thanks for your help!
Using cars (sampling by weight) as an example:
data have;
set sashelp.cars;
run;
*naive macro approach -- not using REPS option;
%macro loop(dsin=,dsoutpfx=,n=);
%do i=1 %to &n. %by 1;
proc surveyselect noprint data=&dsin.
method=pps sampsize=25 out = &dsoutpfx._&i.;
size weight;
run;
%end;
proc sql; drop table tempsortsize; quit;
%mend;
%loop(dsin=have,dsoutpfx=smple,n=5);
* using REPS option;
proc surveyselect noprint data=have
method=pps sampsize=25 reps=5 out = samples;
size weight;
run;
*(splitting if necessary);
data _null_;
do i=1 to 5;
call execute(
cats(
'data smple_',i,';
set samples(where=(replicate=',i,'));
run;'
)
);
end;
run;
-unison
I think you can use the option REPS=100 to obtain the 100 desired samples. They will all be in one data set, identified by the Replicate variable, but you can split them up if necessary (probably not necessary).
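A minimal sketch of the REPS=100 suggestion, assuming a stand-in data set: the `ord` table built here is hypothetical (50 units with a size measure `b`), and the names `samples100` and `rep_means` and the seed value are illustrative. It also shows why splitting is often unnecessary, since the Replicate variable can drive a per-sample analysis directly.

```sas
/* Hypothetical stand-in for the original data: 50 units, size measure b */
data ord;
  do id = 1001 to 1050;
    b + 2;            /* sum statement: b = 2, 4, ..., 100 */
    output;
  end;
run;

/* One call draws all 100 PPS samples of 25 units each.       */
/* SEED= is optional; it only makes the draw reproducible.    */
proc surveyselect noprint data=ord out=samples100
     method=pps sampsize=25 reps=100 seed=20240501;
  size b;
run;

/* Per-sample summary without splitting: one output row per Replicate */
proc means data=samples100 noprint nway;
  class replicate;
  var b;
  output out=rep_means n=n_units mean=mean_b;
run;
```

With real data, PPS selection without replacement can fail when a single unit's size measure is too large relative to the total for the requested sample size, so the size variable is worth checking before running this against the actual table.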
As suggested by @PaigeMiller and @unison, generate the entire thing using REPS=100. I can't imagine why you would need to split it into 100 chunks, but if you really want to do it dynamically in a single step, the hash object is your best friend. Note that below, REPS=5 is used instead of 100 for the sake of sanity.
data have ;
do i = 1001 to 1050 ;
x = 0 ; b + 2 ; dummy + 1 ;
output ;
end ;
run ;
proc surveyselect noprint data=have out=samples method=pps sampsize=25 reps=5 ;
size b ;
run ;
data _null_ ;
if _n_ = 1 then do ;
dcl hash h (dataset:"samples(obs=0)", multidata:"y", ordered:"a") ;
h.definekey ("replicate") ;
h.definedata (all:"y") ;
h.definedone () ;
end ;
do until (last.replicate) ;
set samples ;
by replicate ;
h.add() ;
end ;
h.output (dataset: "sample" || put (_n_, z3.)) ;
h.clear() ;
run ;
The output data set names are auto-created as sample001, sample002, and so on, so that they appear properly sorted by name when viewed in the library. You can change that and/or shape the names the way you want by editing the character expression assigned to the DATASET argument tag in the OUTPUT method call.
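As a sketch of that last point, the variant below builds each name from the Replicate value rather than from _N_. The `pps_rep_` prefix and the seed value are illustrative assumptions; everything else follows the step above.

```sas
data have;
  do i = 1001 to 1050;
    b + 2;
    output;
  end;
run;

proc surveyselect noprint data=have out=samples
     method=pps sampsize=25 reps=5 seed=1;
  size b;
run;

data _null_;
  if _n_ = 1 then do;
    dcl hash h (dataset:"samples(obs=0)", multidata:"y", ordered:"a");
    h.definekey ("replicate");
    h.definedata (all:"y");
    h.definedone ();
  end;
  /* load one replicate's rows into the hash */
  do until (last.replicate);
    set samples;
    by replicate;
    h.add();
  end;
  /* name derived from the current Replicate value (hypothetical prefix) */
  h.output (dataset: cats("pps_rep_", put(replicate, z3.)));
  h.clear();
run;
```

This should write pps_rep_001 through pps_rep_005 to the WORK library. Keying the name on Replicate rather than _N_ keeps the names aligned with the sample identifiers even if the input is subset first.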
Kind regards
Paul D.