are Hi,
I am struggling with spilitting my dataset into two based on number of observations.
I have a dataset whose total number of observations can change on pulling out data every month from an access database. The SAS ultimately outputs data to excel file for creating map.
The problem is, I want to split my data into two if my total number of observations is greater than 100.
For eg: something like If No. of observations >=100 then split dataset and make two sets of 50 observation each. OR if no. of observaitons <100 then do nothing.
Can anyone help me with this? I would really appreciate your time.
Thanks,
Another variation version of Art's code.
data have; do i=1 to 102; output; end; run; %macro split(dsn); %let dsid=%sysfunc(open(&dsn)); %let nobs=%sysfunc(attrn(&dsid,nobs)); %let dsid=%sysfunc(close(&dsid)); %let half=%sysevalf(&nobs/2,integer); %if &nobs. gt 100 %then %do; data &dsn.1 &dsn.2; set &dsn.; if _n_ le &half then output &dsn.1; else output &dsn.2; run; %end; %mend split; %split(have)
Ksharp
Hi,
Is this what you want.
data one;
do i=1 to 200;
output;
end;
run;
data want want1 nothing;
set one;
if _n_>100 and _n_<150 then output want;
else if _n_>150 then output want1;
else if _n_<100 then output nothing;
run;
Thanks,
Shiva
Not sure about the efficiency benchmark, but here is another data step approach:
data want want1 nothing;
set one (firstobs=100 obs=150 in=want)
one (firstobs=151 in=want1)
one (obs=99 in=nothing);
if want then output want;
else if want1 then output want1;
else if nothing then output nothing;
run;
Regards,
Haikuo
or
data have;
do i=1 to 110;
output;
end;
data want1 want2;
set have nobs=nobs;
if nobs>100 then do;
if mod(_n_,2)=0 then output want1;
else output want2;
end;
else stop;
run;
hi ... here's another idea that varies the number of data sets produced (d1, d2, etc.) by size of the original data set ...
* make some data (try some different sizes);
data one;
do i=1 to 420;
output;
end;
run;
* if more than 100 obs, divide into data sets;
filename x temp;
data _null_;
file x;
set one nobs=obs;
if obs le 100 then stop;
put 'data';
x = ceil(obs/50);
do j=1 to ceil(obs/50);
put 'd' j;
end;
put '; set one; select;';
do j=1 to ceil(obs/50) - 1;
k = j * 50;
put 'when (_n_ le ' k ') output d' j ';';
end;
put 'otherwise output d' j '; end; run;';
stop;
run;
%include x / source2;
filename x;
I kind of liked Linlin's approach, but I would wrap it in a macro. E.g.:
data have;
do i=1 to 102;
output;
end;
run;
%macro split(dsn);
proc sql noprint;
select count(*) into :nobs
from &dsn.
;
quit;
%if &nobs. gt 100 %then %do;
data &dsn.1 &dsn.2;
set &dsn.;
if mod(_n_,2)=0 then output &dsn.1;
else output &dsn.2;
run;
%end;
%mend split;
%split(have)
Another variation version of Art's code.
data have; do i=1 to 102; output; end; run; %macro split(dsn); %let dsid=%sysfunc(open(&dsn)); %let nobs=%sysfunc(attrn(&dsid,nobs)); %let dsid=%sysfunc(close(&dsid)); %let half=%sysevalf(&nobs/2,integer); %if &nobs. gt 100 %then %do; data &dsn.1 &dsn.2; set &dsn.; if _n_ le &half then output &dsn.1; else output &dsn.2; run; %end; %mend split; %split(have)
Ksharp
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.
Find more tutorials on the SAS Users YouTube channel.