Hi,
I have two data sets a and b with 200 entries each. While we have the capacity to test 350 samples each month, I have to use one of the data sets each month (in alternating mode so a b a b etc), and top the remaining places up with samples from the other dataset.
So say, for January, I use dataset a and top it up with 150 samples of dataset b.
In this case I could simply use Proc Sql... obs = 150
However, the sample set sizes change every month.
What do I need to do to choose all of a and fill up with samples from sample set b until the sample size = 350?
Any ideas are highly welcome.
Many thanks
Here is one way to go:
data have_a;
  ds='A';
  do i=1 to 200;
    output;
  end;
run;
data have_b;
  ds='B';
  do i=1 to 200;
    output;
  end;
run;
%let n_sample=350;
%let ds_all=have_a;
%let ds_sample=have_b;
data want(drop=_:);
  /* ds with all records */
  do while(not last_A);
    set &ds_all nobs=nobs_A end=last_A;
    output;
  end;
  /*** based on: http://support.sas.com/kb/24/722.html ***/
  /* Method 3: Using SAS DATA Step with no sort required */
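  /* Initialize _K to the number of sample obs still needed and _N to the */
  /* total number of obs in the data set being sampled (not &ds_all).     */
  /* NOBS= is set at compile time, so nobs_B is available before the SET. */
  _k= &n_sample - nobs_A;
  _n=nobs_B;
  do while(1);
    set &ds_sample nobs=nobs_B;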
    /* To randomly select the first observation for the sample, use the */
    /* fact that each obs in the data set has an equal chance of being */
    /* selected: k/n. If a random number between 0 and 1 is less than */
    /* or equal to k/n, we select that the first obs for our sample */
    /* and also adjust k and the number of obs needed to complete the */
    /* sample. */
    if ranuni(0) <= _k/_n then
      do;
        output;
        _k=_k-1;
      end;
    /* At every iteration, adjust N, the number of obs left to */
    /* sample from. */
    _n=_n-1;
    /* Once the desired number of sample points are taken, stop iterating */
    if _k=0 then leave;
  end;
  stop;
run;
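If SAS/STAT is licensed, the same top-up could probably also be done with PROC SURVEYSELECT instead of the hand-written selection loop. The following is only a sketch under that assumption: the names n_all, n_fill, fill and want_alt are made up for illustration, the seed is an arbitrary value chosen so the draw is repeatable, and the code assumes the fill-up count is positive and no larger than the number of records in the data set being sampled.

```sas
%let n_sample=350;
%let ds_all=have_a;
%let ds_sample=have_b;

/* Count the records in the data set that is taken in full */
proc sql noprint;
  select count(*) into :n_all trimmed from &ds_all;
quit;

/* Number of records still needed from the other data set (hypothetical name) */
%let n_fill=%eval(&n_sample - &n_all);

/* Simple random sample without replacement; the seed is an arbitrary choice */
proc surveyselect data=&ds_sample out=fill method=srs
                  sampsize=&n_fill seed=20240101 noprint;
run;

/* Stack the full data set and the random fill-up sample */
data want_alt;
  set &ds_all fill;
run;
```

To alternate months, swap the values of ds_all and ds_sample; the fill-up count is recomputed from whatever size the full data set has that month.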